Best practice to manage CRUD of @OneToMany

I am currently working on a system that can have 100k customers.
Each customer usually has 1-10k end customers, but could have up to 1 million end customers in the extreme case.

An EndCustomerEntity has two lazy @OneToMany relations.

@Entity
public class EndCustomerEntity {

    @Id
    @JdbcTypeCode(SqlTypes.CHAR)
    @Column(length = 36)
    @GeneratedValue
    public UUID id;

    @Column(name = "ext_id")
    public String externalId;

    @Column(name = "first_name")
    public String firstName;

    public String email;

    /* TODO check orphanRemoval = true */
    @OneToMany(mappedBy = "endCustomerEntity", cascade = CascadeType.ALL, orphanRemoval = true)
    public List<EndCustomerFieldEntity> endCustomerFields = new ArrayList<>();

    @OneToMany(mappedBy = "endCustomerEntity", cascade = CascadeType.ALL, orphanRemoval = true)
    public List<EndCustomerLabelEntity> endCustomerLabels = new ArrayList<>();

    @Column(name = "created_at")
    @CreationTimestamp
    public Instant createdAt;

    @Column(name = "updated_at")
    @UpdateTimestamp
    public Instant updatedAt;

    @Version
    @Column(columnDefinition = "integer DEFAULT 0", nullable = false)
    public long version;
 ...
}

One EndCustomerEntity can have a maximum of 30 fields and labels (limited for performance reasons).
On a side note, you can also search/filter by the fields and labels.

Adding and updating the end customer fields and labels is complex; the entity alone had 165 lines of code
(just 82 lines without the add/update logic for the fields and labels).
So one of my goals is to simplify/shorten the entity.

However, what truly brought me to these thoughts is the fact that for an addon that I am working on,
I want to be informed when a field / label is added, updated or removed.

GPT-4 told me there are event listeners or interceptors, and it seems there are also some related annotations from JPA.
But I am unsure whether that is really the best (readable/maintainable/performant/straightforward) approach,
and I wanted to refactor the entity anyway, as written above.

So now I started to create dedicated FieldService and LabelService classes -
but that means someone could add an end customer with the fields and labels missing.
The fields and labels cannot live on their own without the end customer, so this feels wrong / dangerous / not properly encapsulated.

So what to do (in no particular order):

  1. Keep the entity as is and use event listeners or interceptors (or something like that) to be informed of changes
  2. Keep the @OneToMany lists in the EndCustomerEntity, but manage those elements in a dedicated service
  3. Remove the @OneToMany lists from the EndCustomerEntity and merge the 3 entities in the business domain only
  4. Move the management of the @OneToMany lists into static mappers/factories that are called from the entity to simplify the entity, but use listeners/interceptors to be informed of changes
  5. Move the management of the @OneToMany lists into static mappers/factories and call them from an EndCustomer ORM service/repository
    … and many other possibilities.

What is the best practice here / What would you recommend me to do?

Many thanks in advance :pray: for any thoughts on this.

Some more thinking brought me to this:

Moving the code into the repository might be a solution.

Since anyone making any ORM-related change to the EndCustomerEntity would usually go through the repository, that would be an “atomic” solution, I guess, or close to it.
Also, the repository would allow me to add some service instance to listen to any changes to the fields/labels.

To keep the code in the repository minimal, I could calculate the remove/update/add operations
for fields and labels in a factory. Then, based on the “EndCustomerFieldUpdate”, I could execute the necessary ORM operations either by delegating to the field/label repositories directly - but that would make the EndCustomerRepository a compound repository, which I am unsure is good - or by executing the changes directly on the underlying EntityManager, which should also work, I guess.

However, the EndCustomerRepository itself is currently at 292 lines of code, so it could probably also use some further refactoring to become shorter/simpler.

Anyway, here’s my rough idea (in this case using nested repositories):

public class EndCustomerRepository implements PanacheRepositoryBase<EndCustomerEntity, UUID> {
    private final EndCustomerFieldRepository endCustomerFieldRepository;
    private final EndCustomerLabelRepository endCustomerLabelRepository;

    private EndCustomerEntity updateEntity(EndCustomerEntity existingEntity, EndCustomer updatedEndCustomer) {
        EndCustomerFieldUpdate endCustomerFieldUpdate = EndCustomerFieldUpdateFactory.updateEndCustomerFields(existingEntity.id, existingEntity.endCustomerFields, updatedEndCustomer.endCustomerFields());

        removeFields(endCustomerFieldUpdate.fieldsToRemove());
        updateFields(endCustomerFieldUpdate.fieldsToUpdate());
        addFields(endCustomerFieldUpdate.fieldsToAdd());

        return existingEntity.update(updatedEndCustomer.firstName(), updatedEndCustomer.email(), updatedEndCustomer.status());
    }

    // ... labels handled analogously ...
}
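The “EndCustomerFieldUpdate” calculation itself can be sketched without any ORM types. This is a hypothetical, simplified version (plain maps of field name to value instead of EndCustomerFieldEntity instances; all names are made up):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch of the diff factory: compare the persisted field state
// with the desired target state and split the result into remove/update/add
// buckets, keyed by field name.
final class FieldDiff {
    final Set<String> toRemove = new HashSet<>();
    final Set<String> toUpdate = new HashSet<>();
    final Set<String> toAdd = new HashSet<>();

    static FieldDiff of(Map<String, String> existing, Map<String, String> desired) {
        FieldDiff diff = new FieldDiff();
        for (String name : existing.keySet()) {
            if (!desired.containsKey(name)) {
                diff.toRemove.add(name);               // dropped from the target state
            } else if (!Objects.equals(existing.get(name), desired.get(name))) {
                diff.toUpdate.add(name);               // still present, value changed
            }
        }
        for (String name : desired.keySet()) {
            if (!existing.containsKey(name)) {
                diff.toAdd.add(name);                  // new in the target state
            }
        }
        return diff;
    }
}
```

The repository (or a dedicated service) would then translate the three buckets into ORM operations, and the entity itself stays free of that bookkeeping.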

WDYT?

Third idea (just trying to help you help me ;).

public class EndCustomerRepository implements PanacheRepositoryBase<EndCustomerEntity, UUID> {
    private final FieldOrmService fieldOrmService;
    private final LabelOrmService labelOrmService;

    public EndCustomerRepository(FieldOrmService fieldOrmService, LabelOrmService labelOrmService) {
        this.fieldOrmService = fieldOrmService;
        this.labelOrmService = labelOrmService;
    }

    public EndCustomerEntity updateEntity(EndCustomerEntity existingEntity, EndCustomer updatedEndCustomer) {
        fieldOrmService.updateFields(existingEntity.id, existingEntity.endCustomerFields, updatedEndCustomer.endCustomerFields());
        labelOrmService.updateLabels(existingEntity.id, existingEntity.endCustomerLabels, updatedEndCustomer.endCustomerLabels());
        
        return existingEntity.update(updatedEndCustomer.firstName(), updatedEndCustomer.email(), updatedEndCustomer.status());
    }
}

Having the repository call ORM-related services (or DAOs?) for the field and label list updates. This keeps the code related to fields and labels minimal, while an update to an EndCustomerEntity stays “atomic”, in that it will always update all 3 entities in one go. I always put transactions at the outermost boundary, e.g. the REST resource, so I am not speaking of atomicity in the sense of transactions - that part is ensured - just logical atomicity.

If you want to search based on the fields and labels, I would recommend using a JSON representation instead, as modern databases support indexes on top of these datatypes that allow efficient searching without having to join. For modelling this, you’d need to upgrade to Hibernate 6.2 though, or use the Hibernate Types project.

You can then use the various JSON functions of your database for querying, e.g. where json_value(myJsonField, '$.'||:fieldName) = :fieldValue


I am in fact using Hibernate 6.2 as I am using Quarkus 3.1.

However, as a database I am still using MariaDB, and to my knowledge, unlike Postgres, it does not yet properly support JSON (yes, it has some support, but not fully indexed, AFAIK).

Also, reimplementing the fields as JSON at this point - given that I have just worked on extended filters based on the fields and labels, as written in the other post, and given that this is just a side step of a side step of another big feature - is too much right now, I think.

So back to the original question - what’s the recommended design for @ManyToOne associations?
Inside the entity seems the simplest, but it made my entity long and ugly.

I have now implemented a version with one Quarkus Panache repository delegating the updates to the field and label repositories. This feels wrong too, however - I was only able to make it work by:

  1. Persisting and flushing the parent entity,
  2. Persisting and flushing the two child entities after the update,
  3. Refreshing the parent entity again so I can actually see and return the change I just made.

That feels very much “hacked” and wrong too :(.

  1. Is there any good/recommended way to manage @ManyToOne relations outside the parent entity?
  2. Assuming there is not, I have now started implementing an annotation-based solution to be informed of any value changes after the commit, in this way:
@Entity
// register the listener so the callbacks below actually fire
@EntityListeners(EndCustomerFieldEntityListener.class)
public class EndCustomerFieldEntity extends AbstractEntity {

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "endCustomer_id")
    public EndCustomerEntity endCustomerEntity;

    @Column(name = "value")
    public String value;

    @Transient
    String originalValue;

    @PostLoad
    public void onPostLoad() {
        originalValue = value;
    }
    [...]
}

public class EndCustomerFieldEntityListener {

    @Inject
    Event<EndCustomerFieldEntityChange> endCustomerFieldEntityChangeEvent;

    @PreUpdate
    private void beforeUpdate(EndCustomerFieldEntity endCustomerFieldEntity) {
        if (!Objects.equals(endCustomerFieldEntity.originalValue, endCustomerFieldEntity.value)) {
            EndCustomerEntity endCustomerEntity = endCustomerFieldEntity.endCustomerEntity;
            EndCustomerFieldEntityChange event = EndCustomerFieldEntityChange.of(
                    endCustomerEntity.id,
                    endCustomerFieldEntity.originalValue,
                    endCustomerFieldEntity.value);

            endCustomerFieldEntityChangeEvent.fireAsync(event);
        }
    }
}

@ApplicationScoped
public final class MyServiceTrigger {

    public void handleEndCustomerFieldEntityChange(@ObservesAsync EndCustomerFieldEntityChange event) {

    }
}

Even if it doesn’t have direct indexing support yet, it already supports the JSON functions: Differences between JSON_QUERY and JSON_VALUE - MariaDB Knowledge Base

According to this blog post by the MariaDB devs, it is possible to use virtual columns for indexing: Using JSON in MariaDB | MariaDB

So back to the original question - what’s the recommended design for @ManyToOne associations?
Inside the entity seems the simplest, but it made my entity long and ugly.

You asked for recommendations, so I gave you one.

However, what truly brought me to these thoughts is the fact that for an addon that I am working on,
I want to be informed when a field / label is added, updated or removed.

Then use @PostPersist/@PostUpdate and @PostRemove entity lifecycle listeners for the field/label entities, like your EndCustomerFieldEntityListener solution already seems to be doing. The alternative is the Hibernate Interceptor API, which has access to old and new state and similar callback methods.

You asked for recommendations, so I gave you one.

Sure, and thanks :pray:

But things are never easy :wink:

Besides MariaDB, and the fact that there is a lot of code and functionality involved, field values can be added/updated/deleted at any time, and I guess updating a JSON model would not be so simple.

That aside, I still wonder whether you’d favour having the @ManyToOne update logic inside the parent entity - about 80 lines of ugly, condensed, stream-based code for the management of 2 @ManyToOne relations -
or moving it into 2 dedicated services, or into 2 dedicated repositories, and how to manage this relation. And how do you feel about the need to flush and refresh manually - is that considered OK, or rather not?

Regarding the notification - do you recommend the JPA entity listener or the Hibernate Interceptor API?
It is not yet clear to me which one is better for the given case.

Besides MariaDB, and the fact that there is a lot of code and functionality involved, field values can be added/updated/deleted at any time, and I guess updating a JSON model would not be so simple.

Modelling this as JSON is as simple as replacing List<EndCustomerFieldEntity> endCustomerFields with @JdbcTypeCode(SqlTypes.JSON) Map<String, String> endCustomerFields and List<EndCustomerLabelEntity> endCustomerLabels with @JdbcTypeCode(SqlTypes.JSON) List<String> endCustomerLabels.

With a Hibernate Interceptor you will get the old and the new state automatically, so the JSON approach would be very easy to implement. Copy old and new collections and call retainAll on the copies against the other state, then you end up with what was added/removed and can do your business logic.
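That copy-and-compare step can be sketched in plain Java (hypothetical helper; note it uses removeAll on the copies, which yields the added/removed elements directly, whereas retainAll would yield the unchanged intersection):

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: given the old and new label collections (as an
// Interceptor callback would hand them over), compute what was added/removed.
final class CollectionDelta {
    static Set<String> added(Collection<String> oldState, Collection<String> newState) {
        Set<String> result = new HashSet<>(newState);
        result.removeAll(oldState);   // in the new state, but not in the old one
        return result;
    }

    static Set<String> removed(Collection<String> oldState, Collection<String> newState) {
        Set<String> result = new HashSet<>(oldState);
        result.removeAll(newState);   // in the old state, but gone now
        return result;
    }
}
```

The business logic can then react to each added/removed element individually, independent of how the collections are mapped.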

That aside, I still wonder whether you’d favour having the @ManyToOne update logic inside the parent entity - about 80 lines of ugly, condensed, stream-based code for the management of 2 @ManyToOne relations -
or moving it into 2 dedicated services, or into 2 dedicated repositories, and how to manage this relation. And how do you feel about the need to flush and refresh manually - is that considered OK, or rather not?

I still don’t understand what the purpose of extracting added/removed elements is. Could you please explain why you want to do that in the first place?

Regarding the notification - do you recommend the JPA entity listener or the Hibernate Interceptor API?

I’d recommend the Interceptor API to avoid having to manage the original state yourself.


Please help me to the light - I don’t find it that simple yet ;).

So, first of all, I have these complex filters that I would need to refactor.

Then I’d need to adjust the entities and the database tables, migrate staging and PROD, find a way to ensure referential integrity in a performant, maintainable way, update the logic for adding/updating/deleting fields and labels, and adjust my integration tests.

My estimate would be 1-2 weeks part-time for all that.

Besides, here’s the reason why I’d want to move the logic out of the entity:

  1. The class is generally too long. Extracting the logic would allow me to break it down better.
    I could try to mitigate that by calling some static helper classes/methods from the entity instead.

  2. My original attempt was not to use any Hibernate/JPA-related (annotation-based) logic,
    and to just have the 2 service classes manually call another service they have autowired whenever a field changes. The advantage would be less magic and less dependence on Hibernate - as a result, easier to understand, test, and maintain.

But I guess that, at best, I could only see that there is an attempt to change a field; without annotations I would not see whether the value truly changed, and I would have no way to react AFTER the transaction is committed - it may happen that I announce field changes and then a rollback occurs, which would at a minimum cause the trigger to be executed twice.

So it seems I have to use some Hibernate/JPA annotation-based “magic” solution anyway… which weakens the argument for moving the logic out of the parent entity and leaves just the need to make it simpler. Maybe I should instead create a static factory that calculates the new state, which could then be applied to the child entities in a simpler way.

Regarding Hibernate Interceptors, GPT-4 tells me that, in order to be informed of changes AFTER the commit, I would have to use the `PostCommitUpdateEventListener`, which is part of the Hibernate SPI and therefore might not be stable.

Woah sorry I am a bit lost atm :wink:

Thanks for your time :pray:. Grateful.

Another reason why I’d prefer a service-based approach -

I will need several “action-based” methods -

like, based on an event: add label, remove label, add field, remove field, update field… and so on.

Those could be added to the EndCustomerRepository, which is too large already anyway, and they would require looking up the end customer first, adding lots of extra complexity - when all I truly want is just to add a field or label…

Update - GPT says: “While Hibernate Interceptors provide various hooks into the lifecycle of an entity, they are designed to be invoked before a transaction is committed. They don’t provide an easy way to be notified after a transaction has been successfully committed.”

But that’s exactly what I want - only once the transaction is committed can I be sure the entity was truly added/updated/deleted.

Finally, I re-realized why a FieldService would be much nicer than any annotation-based solution -
I could just hook my business service into the existing transaction normally, and then either the field is successfully committed AND my action is successfully performed (also a commit), or both are rolled back, so I get atomicity.
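That “only notify once the commit succeeded” idea can be illustrated framework-free (all names here are made up; in Quarkus you would get the same effect by registering a JTA Synchronization, or by firing a CDI event observed with @Observes(during = TransactionPhase.AFTER_SUCCESS)):

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: buffer change notifications during the surrounding
// "transaction", deliver them on commit, and drop them on rollback.
final class AfterCommitNotifier {
    private final List<Runnable> pending = new ArrayList<>();

    void onFieldChanged(Runnable notification) {
        pending.add(notification);        // buffered, not yet delivered
    }

    void commit() {
        pending.forEach(Runnable::run);   // deliver only on successful commit
        pending.clear();
    }

    void rollback() {
        pending.clear();                  // never deliver for rolled-back work
    }
}
```

The point is simply that the notification must live in the same unit of work as the field change: both succeed together or neither does.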

Glad you found a way forward. I’ll consider this topic done as it’s very hard to follow for me already. If you need more help, please ask further questions in separate topics and try to boil the question part down to a single sentence :slight_smile: