Delinking orphans when saving a detached entity graph

slavkok · March 26, 2023, 2:02am

Hi – I’m working on a web app. The items (plural because they’re a graph) a user sends may or may not already be in the database; they are deserialized from the payload, and at that point they’re detached.

More or less, I can see that things work. My only gripe is that relationships to children from a previous save don’t get broken. The only way I can tell that they were not part of the last payload is the version column. This is what happens when I use saveOrUpdate().

If I try to merge into the session, turn on orphan removal, and then use JPA persist(), I get the infamous exception about collections being new.

If I turn on orphan removal with saveOrUpdate(), it simply doesn’t work. I also have CascadeType.ALL on the OneToMany side.

How exactly can I make sure that OneToMany collections I (lazily or eagerly) load are actually the same thing I saved last time, without having to save them in the first place by rummaging through an existing graph?

Curiously, I see that many-to-many relationships do not act this way. If you save without an element, the relationship to it is broken. One-to-many relationships still remain intact…

This is on 5.4.

Thanks everyone!

beikov · March 27, 2023, 7:40am

Please share the model and code that you are using, as well as exceptions that you are seeing.

slavkok · March 27, 2023, 5:22pm

Hi. Thanks for the quick response. I’m not getting any exceptions.

I’m not allowed to post the actual code, but I’ll give you a description.

All my relationships are bidirectional. All my FetchTypes are FetchType.LAZY, in both directions. I have no OneToOnes, but a bunch of OneToManys and ManyToManys. Cascades are ALL Parent->Child in OneToMany, and MERGE and PERSIST in ManyToMany on the dominant side. OneToManys are lists, ManyToMany are sets with FetchMode.JOIN. mappedBys are on the correct sides.

To save, I use

getHibernateSession().saveOrUpdate(obj);

One of the requirements is that in all cases, the graph I’m saving may be detached (freshly deserialized from web requests).

So basically, I need two cases to work, determined at runtime:

1) “Comprehensive” save of a graph. What I mean by that is that I want the graph persisted exactly as posted, and any existing relationships to objects that are not part of the save broken. Basically, if I subsequently load the graph recursively, I want to get exactly what I saved. What I’m finding is that many-to-manys get broken as they should (entries are removed from the join table), but one-to-manys do not. Some of the child foreign keys are non-nullable, so orphan removal is desirable, but in my case, if I add it to the annotation, it does nothing.
2) “Non-comprehensive” or “merge” save of a graph. In this case, I only want associations added, but nothing broken and nothing deleted.

I can achieve both programmatically by recursing through graphs and deleting things or breaking associations in the former case, or in the latter, adding associations. But I suspect there’s a more idiomatic way to do both.

If you’d be so kind, we can chat on Slack or Google Meet or Zoom, and then I can be much more specific about the code itself.

Thanks again.

beikov · March 27, 2023, 9:35pm

I don’t know what you think is so special about your model that you can’t share an excerpt of it, but this is not how the community works. Usually, people don’t help you if you can’t provide enough details to make your problem comprehensible.
If your company forbids this sort of sharing fully, you will have to pay a consultant to help you.

You can use our test case template (hibernate-test-case-templates/JPAUnitTestCase.java at main · hibernate/hibernate-test-case-templates · GitHub) to reproduce your scenario and then, piece by piece, remove unnecessary associations/fields while making sure it still shows the problem, until the model is minimal. You can even rename the models if you must.

slavkok · March 28, 2023, 2:02am

There’s absolutely nothing special about my model. But my company (Amazon) is very strict about sharing proprietary code. But ok, I’ll write up a model that demonstrates something similar.

Again, in a huge majority of cases, the entities first come detached, from a web request, and more often than not, some (a majority) of the graph already has counterparts in the database with the same identifiers.

Let’s say we have a model like this (and forgive me if the annotations aren’t syntactically verbatim, we used bytecode injection for consistency across all entities so I’m typing them by hand, but they are syntactically correct as injected):

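import java.util.List;
import java.util.Set;
import javax.persistence.*;
import org.hibernate.annotations.Fetch;
import org.hibernate.annotations.FetchMode;
import lombok.Getter;
import lombok.Setter;

@Getter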
@Setter
@Entity
public class Town {

    @Id
    String name;

    @OneToMany(fetch=FetchType.LAZY, cascade=CascadeType.ALL, mappedBy="town")
    List<House> houses;

}

@Getter
@Setter
@Entity
public class House {

    @Id
    String name;

    @ManyToOne(optional = false, fetch=FetchType.LAZY)
    Town town;

    @ManyToMany(cascade={CascadeType.PERSIST, CascadeType.MERGE}, fetch=FetchType.LAZY)
    @Fetch(FetchMode.JOIN)
    @JoinTable(name="housetenant", joinColumns=@JoinColumn(name="house_id", referencedColumnName="name"), inverseJoinColumns=@JoinColumn(name="tenant_id", referencedColumnName="amznAlias"))
    Set<Tenant> tenants;

    @OneToMany(fetch=FetchType.LAZY, cascade=CascadeType.ALL, mappedBy="house")
    List<Appliance> appliances;

}

@Getter
@Setter
@Entity
public class Appliance {

    @Id
    String id;

    @ManyToOne(optional = false, fetch=FetchType.LAZY)
    House house;

    ApplianceType type;
}

@Getter
@Setter
@Entity
public class Tenant {

    @Id
    String amznAlias;

    @ManyToMany(mappedBy="tenants")
    @Fetch(FetchMode.JOIN)
    Set<House> houses;
}

In a nutshell, that’s it.

If I do a merge(), the many-to-many between house and tenant disappears if the tenants are not included. The one-to-many between house and appliances doesn’t get removed no matter what I do (unless I programmatically go and call session.delete() on them, in which case CascadeType.ALL works).

The basic question is this: how can I programmatically, at run time, tell it to either 1) save the graph but break all references from existing objects to other objects in the database that are not included in the request graph, or 2) merge with whatever is in the database, breaking no references at all, including the many-to-many one between houses and tenants?

beikov · March 28, 2023, 6:57am

Thanks for the example. This makes it very simple for me to help you.

The reason for what you are seeing is that House#tenants is the owning side of the relationship, which means that data on that side will determine the contents of the relationship.
So if you merge a House with empty tenants, then you’ll get all tenants removed.
House#appliances is the inverse or non-owning side, which is irrelevant for persistence of the relationship. You can alter the relationship only by changing Appliance#house. It is important to keep both sides of the relationship in sync, but if you know what you are doing (caching, persistence context operations), you can also skip maintaining the inverse/non-owning side, since maintaining it might require initializing the collection, which is potentially expensive.
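
To make the "keep both sides in sync" point concrete, here is a minimal plain-Java sketch. It leaves out all JPA annotations so that it runs on its own, and the helper names addAppliance and removeAppliance are hypothetical; they are not part of the model posted above. In the mapped model, only the assignment to Appliance#house would end up in the database.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-ins for the House and Appliance entities (annotations omitted).
class House {
    final String name;
    // Inverse side in the mapped model: it mirrors the relationship but does not define it.
    final List<Appliance> appliances = new ArrayList<>();

    House(String name) {
        this.name = name;
    }

    // Hypothetical helper: links an appliance and updates both sides together.
    void addAppliance(Appliance a) {
        if (a.house != null && a.house != this) {
            a.house.appliances.remove(a); // detach from the previous house first
        }
        if (!appliances.contains(a)) {
            appliances.add(a);
        }
        a.house = this; // owning side: this is the change that matters for persistence
    }

    // Hypothetical helper: unlinks an appliance on both sides.
    // With optional = false on Appliance#house, the appliance would also have to be deleted.
    void removeAppliance(Appliance a) {
        if (appliances.remove(a)) {
            a.house = null;
        }
    }
}

class Appliance {
    final String id;
    House house; // owning side in the mapped model (holds the foreign key)

    Appliance(String id) {
        this.id = id;
    }
}
```

Routing every change through helpers like these is one way to avoid a collection that says one thing while the foreign key says another.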

> The basic question is this: how can I programmatically, at run time, tell it to either 1) Save the graph but break all references from existing objects to other objects in the database that are not included in the request graph

I guess your problem is the removal of “orphaned” elements from the inverse/non-owning side, e.g. Town#houses and House#appliances? If so, then I fear there is no easy answer. To unlink such relationships, you will have to load them and then either set the owning side (e.g. House#town) to null or remove the entity, and also remove the element from the collection.

> How can I tell it to merge with whatever is in the database, breaking no references at all – including the many-to-many one between houses and tenants?

You can load the data first and apply the changes to the entity graph relationship by relationship in this case e.g.

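// Load the managed Town; in this model the @Id of Town is its name.
Town existingTown = entityManager.find(Town.class, requestTown.getName());
if ( !requestTown.getHouses().isEmpty() ) {
    for (House h : existingTown.getHouses()) {
        // contains() only works if House implements equals()/hashCode() based on its identifier
        if ( !requestTown.getHouses().contains(h) ) {
            // House#town is optional = false in this model, so removal is the option that applies here
            h.setTown(null); // or entityManager.remove(h);
        }
    }
    existingTown.getHouses().addAll(requestTown.getHouses());
}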
...
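
One caveat about the contains() check in the loop above: it depends on equals()/hashCode(). The entities as posted only carry Lombok's @Getter and @Setter, so they fall back to identity comparison, and a detached instance from the request would never be equal to the managed instance loaded from the database. Below is a minimal sketch that compares by identifier instead. The class name OrphanDiff and the method findOrphans are hypothetical helpers, and the code is plain Java with no Hibernate dependency.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of Hibernate or JPA.
final class OrphanDiff {

    private OrphanDiff() {
    }

    /**
     * Returns the elements of {@code existing} whose identifier does not occur
     * among the identifiers of {@code requested}, in iteration order.
     */
    static <T, ID> List<T> findOrphans(Collection<T> existing,
                                       Collection<T> requested,
                                       Function<? super T, ID> idOf) {
        // Collect the identifiers present in the request graph.
        Set<ID> requestedIds = new HashSet<>();
        for (T r : requested) {
            requestedIds.add(idOf.apply(r));
        }
        // Anything loaded from the database but absent from the request is an orphan.
        List<T> orphans = new ArrayList<>();
        for (T e : existing) {
            if (!requestedIds.contains(idOf.apply(e))) {
                orphans.add(e);
            }
        }
        return orphans;
    }
}
```

For the town example, a call such as OrphanDiff.findOrphans(existingTown.getHouses(), requestTown.getHouses(), House::getName) would yield the houses to unlink or remove. Because the result is a separate list, the orphans can also be taken out of existingTown.getHouses() afterwards without modifying that collection while iterating over it.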

This can become really cumbersome, but there is no easy way to handle this otherwise when you want to manage an unowned collection. If you make the relationship unidirectional instead, i.e. remove House#town and map Town#houses with @JoinColumn, then you could handle all of this through the collection instead, but lose the querying capability of House by town.

If you are willing to use DTOs instead, I think this is a perfect use case for Blaze-Persistence Entity Views, which handles all of this transparently for you. Relationship ownership doesn’t matter to Entity Views with respect to updatability. It takes care of the nifty details to make this work fast.

I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) via JPQL expressions to the entity model.

A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:

@EntityView(Town.class)
@UpdatableEntityView
public interface TownDto {
    @IdMapping
    String getName();
    @UpdatableMapping(orphanRemoval = true)
    Set<HouseDto> getHouses();

    @EntityView(House.class)
    @UpdatableEntityView
    @CreatableEntityView
    interface HouseDto {
        @IdMapping
        String getName();
        void setName(String name);
        @UpdatableMapping(orphanRemoval = true)
        Set<TenantDto> getTenants();
        @UpdatableMapping(orphanRemoval = true)
        Set<ApplianceDto> getAppliances();
    }
    @EntityView(Tenant.class)
    interface TenantDto {
        @IdMapping
        String getAmznAlias();
    }
    @EntityView(Appliance.class)
    @UpdatableEntityView
    @CreatableEntityView
    interface ApplianceDto {
        @IdMapping
        String getId();
        void setId(String id);
        ApplianceType getType();
        void setType(ApplianceType type);
    }
}

Querying is a matter of applying the entity view to a query, the simplest being just a query by id.

TownDto a = entityViewManager.find(entityManager, TownDto.class, id);

The Spring Data integration allows you to use it almost like Spring Data Projections: Blaze Persistence - Entity View Module

Page<TownDto> findAll(Pageable pageable);

TownDto save(TownDto town);

The best part is, it will only fetch the state that is actually necessary!

slavkok · March 28, 2023, 1:41pm

Hi Christian –

first, I can’t express enough how impressed I am with your answer and the time you took to compose it. Many, many thanks. Things are starting to click in my slow brain.

I suppose it’s counter-intuitive to me that the child is the owning side in a OneToMany. Yes, it owns the FK, but don’t parents generally decide when to kick children out of their houses in real life?

Here’s, then, what confuses me: Why do we put cascades on the non-owning side in a OneToMany, but on the owning side in the ManyToMany?

So it seems to me that if I reversed the owning sides in my ManyToManys, I would at least get one of my use-cases covered, and that is “merge with what’s in the db and don’t break any relationships when saving graph root entities”. So, tenants would own houses and not the other way around. If I do that, should cascades (PERSIST/MERGE) still be on the side of the House, or should they also move to Tenant?

Do I have that right?

If that’s the case, then I only have to programmatically break relationships in one of my use cases, without having to add them in the other.

And yes, it does seem that ultimately, I’ll have to implement what you suggest with Blaze-Persistence Entity Views. Alas, my example is a drop in the bucket compared to the monster meta-model I have to work with, and I don’t have the time to make such a deep change in my current development cycle. But I will most definitely move in that direction in the near future.

Much gratitude once again.

beikov · March 28, 2023, 2:54pm

> Here’s, then, what confuses me: Why do we put cascades on the non-owning side in a OneToMany, but on the owning side in the ManyToMany?

Cascades allow you to propagate changes, i.e. if you persist/merge a House, then you usually want the Appliance objects associated with it to be flushed as well, even if the relationship owner is on the Appliance side. For a one-to-many, this just means that the flushing of Appliance, which happens due to the cascade, will take care of the relationship House#appliances by flushing the Appliance#house association.

So it doesn’t really matter which side you put cascades on. You do that according to your needs.

> So it seems to me that if I reversed the owning sides in my ManyToManys, I would at least get one of my use-cases covered, and that is “merge with what’s in the db and don’t break any relationships when saving graph root entities”. So, tenants would own houses and not the other way around. If I do that, should cascades (PERSIST/MERGE) still be on the side of the House, or should they also move to Tenant?

I don’t know your model, but only use cascades for relationships where they make sense, as cascades affect what is loaded during a call to e.g. EntityManager.merge().

slavkok · March 28, 2023, 3:36pm

Ok, one more question and I promise I’ll stop pestering you.

Let’s concentrate on the ManyToMany between House and Tenant. Let’s say we have the following set of requirements (all objects are initially detached, and in the database, we have some houses and some tenants):

1) If I save a house with no tenants, I want all the tenants in the database associated with that house to remain associated;
2) If I save a house with a tenant that happens to exist in the database and I change some property of that tenant, I want the property change to be persisted in the database, while all the other tenants remain associated and unchanged;
3) If I save a tenant without a house, and both exist in the database and are associated, I don’t want that association to be broken either (this is the least important requirement);
4) Saving a house with a single tenant that doesn’t exist in the database should add that tenant to whatever collection of tenants is already associated with that house.

So basically, I want property changes to be propagated downstream House->Tenant. Associations should always remain the same, and if they need to be broken, I’ll do so programmatically by manipulating relevant collections.

Who should be the owner, and where and what should the cascades be?

beikov · March 29, 2023, 7:20am

This is something that I call an “additive patch”, and there is no way to model it with plain Hibernate ORM. You will have to implement it as I described before: load the existing entity and apply the changes from the request object to it. Blaze-Persistence Entity-Views supports this sort of model, though.
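
As a rough illustration of what such an additive patch could look like when written by hand, here is a plain-Java sketch for the house/tenant requirements listed above. It is only a sketch under assumptions: TenantData and AdditivePatch are hypothetical names, displayName is a made-up property (the posted Tenant only has amznAlias), and the map of existing tenants stands in for a collection loaded from the database.

```java
import java.util.Map;

// Plain-Java stand-in for Tenant; displayName is a hypothetical property used for illustration.
class TenantData {
    final String amznAlias;
    String displayName;

    TenantData(String amznAlias, String displayName) {
        this.amznAlias = amznAlias;
        this.displayName = displayName;
    }
}

// Hypothetical helper, not part of Hibernate or Blaze-Persistence.
final class AdditivePatch {

    private AdditivePatch() {
    }

    /**
     * Applies the requested tenants on top of the existing ones, keyed by alias.
     * Known tenants get their properties updated, unknown tenants are added,
     * and nothing is ever removed.
     */
    static void apply(Map<String, TenantData> existing, Iterable<TenantData> requested) {
        for (TenantData r : requested) {
            TenantData current = existing.get(r.amznAlias);
            if (current == null) {
                existing.put(r.amznAlias, r);        // new tenant: add the association
            } else {
                current.displayName = r.displayName; // known tenant: propagate the property change
            }
        }
        // Tenants that are absent from the request are deliberately left untouched.
    }
}
```

With Hibernate, the existing side would presumably be the managed House#tenants collection loaded in the current session, so that the changes applied to it are picked up at flush time.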

slavkok · March 29, 2023, 1:51pm

Thanks Christian.

I was able to implement everything I needed programmatically, in addition, of course, to the native Hibernate mappings. In most cases I didn’t need to unproxy entities, so it’s working fairly efficiently.

As soon as I’m done with my current delivery, I’ll look into implementing Blaze-Persistence Entity Views.

Thanks a whole bunch for all your help!
