PreInsertEventListener receives incomplete copy of entity

We recently migrated away from Session#saveOrUpdate as it is listed as deprecated, and because we wanted to depend only on the JPA API.

As such, we implemented a persist-or-merge strategy, similar to:

	if (isNew(entity)) {
		entityManager.persist(entity);
	} else {
		entityManager.merge(entity);
	}

With this new implementation everything sort of works, except for some custom Hibernate event listeners.
We have some PreInsertEventListeners that validate invariants for entities, and may veto changes. Since switching to the persist-or-merge strategy, these event listeners appear to be receiving incomplete copies of the payload entity.

This apparently happens due to the way cascade interacts with DefaultMergeEventListener. The method creates a copy of the entity (1), and proceeds to copy the relation values from the parent (2). It then saves the transient (half-created) entity (3), which triggers our custom listeners. Only after this call completes will the entity be completed by copying the missing relations (4).

		final Object copy = copyEntity( copyCache, entity, session, persister, id ); //1
		super.cascadeBeforeSave( session, persister, entity, copyCache );
		copyValues( persister, entity, copy, session, copyCache, ForeignKeyDirection.FROM_PARENT ); //2
		saveTransientEntity( copy, entityName, event.getRequestedId(), session, copyCache ); //3
		super.cascadeAfterSave( session, persister, entity, copyCache );
		copyValues( persister, entity, copy, session, copyCache, ForeignKeyDirection.TO_PARENT ); //4

I think it is also important to mention that while Session#saveOrUpdate is deprecated, it still works for the time being, and if I switch back to it, this behavior is not observed — everything works as expected.

I’m not sure what I’m missing here. Instinctively it seems strange to me that an event listener would receive an half-completed copy of the entity as the payload, but I might be misunderstanding the APIs involved. Any ideas or clarification would be welcome!

Thank you

As a first note, you don’t need to explicitly check if an entity is “new” and decide between calling persist or merge, the merge operation will inherently do that for you and result in an insert for transient entities and in an update for already managed ones.

I’m not sure what you mean here with “after this call completes will the entity be completed by copying the missing relations”: are you talking about to-one or to-many associations? If you could share an example mapping with the result you would expect vs what you’re actually getting in the pre-insert event listener it would definitely help in understanding what’s going on.

Also, have you tried using JPA’s native @PrePersist lifecycle callback method in the entity class instead?

While I understand there is no need to check if an entity is new, we do it nonetheless because that way we avoid an extra SELECT that the merge call would do. Since we have a reliable way to determine if an entity is new, so we use that mechanism to optimize that path. This is also the way Spring Data does it.

The code fragment I attached before comes from Hibernate’s DefaultMergeEventListener. You can see there that entity values are copied in multiple steps.

I hope this simplified version of the entity model demonstrates the issue:

@Entity
public class Travel {
    @OneToMany(mappedBy = "travel", orphanRemoval = true, fetch = FetchType.LAZY, cascade = [CascadeType.ALL])
    Set<ExpenseEntry> expenses = new HashSet<>();
}

@Entity
public class ExpenseEntry {
    @ManyToOne(cascade = [CascadeType.DETACH], fetch = FetchType.LAZY)
    Travel travel = null;

    @ManyToMany(fetch = LAZY, cascade = [CascadeType.ALL])
    Set<FileData> supportingFiles = new HashSet<>();
}

@Entity
public class FileData {
    // contents irrelevant
}

Travel travel = loadTravel();
ExpenseEntry expense = new ExpenseEntry();
expense.travel = travel;
expense.supportingFiles.add(new FileData());
travel.expenses.add(expense);
save(travel);

With the above example, a PreInsertEventListener looking for entities of type ExpenseEntry will see such entities with the supportingFiles collection empty, and it should contain one FileData.

I also created a @PrePersist callback function, which exhibits the same behavior (supportingFiles comes up empty when the callback function is called). The JPA lifecycle callbacks are invoked slightly earlier than the registered PreInsertEventListeners, that is, after FROM_PARENT relations are copied, but before TO_PARENT relations are copied.

JPA callback call stack:

 	at dao.ExpenseEntry.pre(ExpenseEntry.kt:163)
 	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
 	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
 	at org.hibernate.jpa.event.internal.EntityCallback.performCallback(EntityCallback.java:50)
 	at org.hibernate.jpa.event.internal.CallbackRegistryImpl.callback(CallbackRegistryImpl.java:123)
 	at org.hibernate.jpa.event.internal.CallbackRegistryImpl.preCreate(CallbackRegistryImpl.java:72)
 	at org.hibernate.event.internal.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:196)
 	at org.hibernate.event.internal.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:134)
 	at org.hibernate.event.internal.DefaultMergeEventListener.saveTransientEntity(DefaultMergeEventListener.java:378)
 	at org.hibernate.event.internal.DefaultMergeEventListener.entityIsTransient(DefaultMergeEventListener.java:298)
 	at org.hibernate.event.internal.DefaultMergeEventListener.merge(DefaultMergeEventListener.java:211)
 	at org.hibernate.event.internal.DefaultMergeEventListener.doMerge(DefaultMergeEventListener.java:146)
 	at org.hibernate.event.internal.DefaultMergeEventListener.onMerge(DefaultMergeEventListener.java:130)
 	at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:138)
 	at org.hibernate.internal.SessionImpl.fireMerge(SessionImpl.java:868)
 	at org.hibernate.internal.SessionImpl.merge(SessionImpl.java:839)
 	at org.hibernate.engine.spi.CascadingActions$6.cascade(CascadingActions.java:253)
 	at org.hibernate.engine.spi.CascadingActions$6.cascade(CascadingActions.java:243)
 	at org.hibernate.engine.internal.Cascade.cascadeToOne(Cascade.java:517)
 	at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:439)
 	at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:224)
 	at org.hibernate.engine.internal.Cascade.cascadeCollectionElements(Cascade.java:551)
 	at org.hibernate.engine.internal.Cascade.cascadeCollection(Cascade.java:481)
 	at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:442)
 	at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:224)
 	at org.hibernate.engine.internal.Cascade.cascade(Cascade.java:157)
 	at org.hibernate.event.internal.DefaultMergeEventListener.cascadeOnMerge(DefaultMergeEventListener.java:644)
 	at org.hibernate.event.internal.DefaultMergeEventListener.entityIsPersistent(DefaultMergeEventListener.java:278)
 	at org.hibernate.event.internal.DefaultMergeEventListener.merge(DefaultMergeEventListener.java:214)
 	at org.hibernate.event.internal.DefaultMergeEventListener.doMerge(DefaultMergeEventListener.java:146)
 	at org.hibernate.event.internal.DefaultMergeEventListener.onMerge(DefaultMergeEventListener.java:130)
 	at org.hibernate.event.internal.DefaultMergeEventListener.onMerge(DefaultMergeEventListener.java:84)
 	at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:127)
 	at org.hibernate.internal.SessionImpl.fireMerge(SessionImpl.java:847)
 	at org.hibernate.internal.SessionImpl.merge(SessionImpl.java:833)

Hibernate PreInsertEventListener call stack:

 	at events.ExpenseEntryEventListener.onPreInsert(ExpenseEntryEventListener.kt:22)
 	at org.hibernate.action.internal.EntityIdentityInsertAction.preInsert(EntityIdentityInsertAction.java:186)
 	at org.hibernate.action.internal.EntityIdentityInsertAction.execute(EntityIdentityInsertAction.java:75)
 	at org.hibernate.engine.spi.ActionQueue.execute(ActionQueue.java:670)
 	at org.hibernate.engine.spi.ActionQueue.addResolvedEntityInsertAction(ActionQueue.java:291)
 	at org.hibernate.engine.spi.ActionQueue.addInsertAction(ActionQueue.java:272)
 	at org.hibernate.engine.spi.ActionQueue.addAction(ActionQueue.java:322)
 	at org.hibernate.event.internal.AbstractSaveEventListener.addInsertAction(AbstractSaveEventListener.java:386)
 	at org.hibernate.event.internal.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:300)
 	at org.hibernate.event.internal.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:219)
 	at org.hibernate.event.internal.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:134)
 	at org.hibernate.event.internal.DefaultMergeEventListener.saveTransientEntity(DefaultMergeEventListener.java:378)
 	at org.hibernate.event.internal.DefaultMergeEventListener.entityIsTransient(DefaultMergeEventListener.java:298)
 	at org.hibernate.event.internal.DefaultMergeEventListener.merge(DefaultMergeEventListener.java:211)
 	at org.hibernate.event.internal.DefaultMergeEventListener.doMerge(DefaultMergeEventListener.java:146)
 	at org.hibernate.event.internal.DefaultMergeEventListener.onMerge(DefaultMergeEventListener.java:130)
 	at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:138)
 	at org.hibernate.internal.SessionImpl.fireMerge(SessionImpl.java:868)
 	at org.hibernate.internal.SessionImpl.merge(SessionImpl.java:839)
 	at org.hibernate.engine.spi.CascadingActions$6.cascade(CascadingActions.java:253)
 	at org.hibernate.engine.spi.CascadingActions$6.cascade(CascadingActions.java:243)
 	at org.hibernate.engine.internal.Cascade.cascadeToOne(Cascade.java:517)
 	at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:439)
 	at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:224)
 	at org.hibernate.engine.internal.Cascade.cascadeCollectionElements(Cascade.java:551)
 	at org.hibernate.engine.internal.Cascade.cascadeCollection(Cascade.java:481)
 	at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:442)
 	at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:224)
 	at org.hibernate.engine.internal.Cascade.cascade(Cascade.java:157)
 	at org.hibernate.event.internal.DefaultMergeEventListener.cascadeOnMerge(DefaultMergeEventListener.java:644)
 	at org.hibernate.event.internal.DefaultMergeEventListener.entityIsPersistent(DefaultMergeEventListener.java:278)
 	at org.hibernate.event.internal.DefaultMergeEventListener.merge(DefaultMergeEventListener.java:214)
 	at org.hibernate.event.internal.DefaultMergeEventListener.doMerge(DefaultMergeEventListener.java:146)
 	at org.hibernate.event.internal.DefaultMergeEventListener.onMerge(DefaultMergeEventListener.java:130)
 	at org.hibernate.event.internal.DefaultMergeEventListener.onMerge(DefaultMergeEventListener.java:84)
 	at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:127)
 	at org.hibernate.internal.SessionImpl.fireMerge(SessionImpl.java:847)
 	at org.hibernate.internal.SessionImpl.merge(SessionImpl.java:833)

The issue stems (in both cases) from the problematic call at org.hibernate.event.internal.DefaultMergeEventListener.saveTransientEntity(DefaultMergeEventListener.java:378).

I should also mention, for completeness sake, that we are using Kotlin instead of Java, but I don’t think that would have any bearing with regards to the observed behavior.

To better illustrate the issue, I created a test case.
Am I making any erroneous assumptions regarding how listeners work, or would this be considered a bug?

Any help is much appreciated!

Sorry if I couldn’t get to you earlier @jpmsilva. While merge is the recommended method to migrate to from saveOrUpdate, its internal semantics are not a 1-to-1 replacement: there are differences in how an entity and its associations are persisted and in how cascade is handled.

Looking at the code that you kindly pointed out, it seems we only copy the state of associations owned by the entity being merged, while inverse-side associations (owned by other entities) get subsequently cascaded and copied after the provided object is already in a persistent state. This is probably a requirement for Persistence Context consistency during insertions, but I didn’t dive into your example to check yet.

In conclusion, I don’t think this is a bug - unless explicitly stated in the spec or documentation, or if in a previous Hibernate version merge behaved differently. Ideally, your own logic on whether to persist a new entity or not shouldn’t need to rely on an event listener, especially when this logic is based on the entity’s unowned associations (which are technically not part of its persistent state), but would be able to understand whether to merge the entity or not in advance.

Thank you for your input @mbladel

The JPA spec has this to say about the life cycle callbacks:

The PrePersist and PreRemove callback methods are invoked for a given entity before the respective EntityManager persist and remove operations for that entity are executed. For entities to which the merge operation has been applied and causes the creation of newly managed instances, the PrePersist callback methods will be invoked for the managed instance after the entity state has been copied to it. These PrePersist and PreRemove callbacks will also be invoked on all entities to which these operations are cascaded. The PrePersist and PreRemove methods will always be invoked as part of the synchronous persist, merge, and remove operations.

Of relevance is this part: the PrePersist callback methods will be invoked for the managed instance after the entity state has been copied to it. It does not differentiate between owned and unowned relations.

I would assume that an Hibernate user would find it reasonable to not see half-copied entities in the payload of the pre database events, especially since this works with the now deprecated saveOrUpdate call.
I did not find any references to this behavior in the documentation:

In summary, I think the current observed behavior in native Hibernate events is unexpected from the API user point of view, and possibly against what the JPA spec says about life cycle callbacks.

I see your point, but the spec states in section 3.2.4:

Bidirectional relationships between managed entities will be persisted based on references held by the owning side of the relationship. It is the developer’s responsibility to keep the in-memory references held on the owning side and those held on the inverse side consistent with each other when they change

From this one could argue that inverse or unowned associations are not part of the “entity state”, as they do not affect what is actually persisted to the database, which is part of the reason why they are not copied when merging.

To pass a consistent entity state to PrePersist we would somehow need to first wait until all inverse associations have been cascaded, since the collection contents depend on the actual persistent state of your entities (i.e. what is actually on the database) and not what is in the entity instance you passed to merge, but I don’t think we can do that because these operations happen in different moments.

I can try bringing this up with the team to consider possible improvements, but I do not think this is a bug.

I understand the need to manually keep the in-memory references in sync — my sample case does this manual sync as well:

		expense.travel = travel;
		travel.expenses.add(expense);

I’m not familiar with the inner workings of the DefaultMergeEventListener/AbstractSaveEventListener, but what would happen if the unowned relations were to be copied prior to the cascading insert event took place? Do you happen to know if the cascading mechanism is reliant on this particular behavior?

I understand that merge is not a direct replacement for saveAndUpdate — which does not immediately perform the insertions/updates upon being called, as opposed to merge. Still, we can see that nevertheless it should be possible to have complete copies of the entities being passed to the listeners. If the deprecated saveAndUpdate is able to do this correctly, and merge is not, what should be used instead?

If listeners and life cycle callbacks cannot be trusted to receive accurate representations of the entities they need to act upon, I would argue that this behavior diminishes the usefulness of said features, and might benefit of some note to that effect in the documentation.

And again, thank you very much for your valuable input!

As a naïve test, I tried moving up the call copyValues( persister, entity, copy, session, copyCache, ForeignKeyDirection.TO_PARENT ); so it happens sooner, and right away multiple tests failed, so I guess that answers my own question.

Something else that is confusing me is related to the owning side of many-to-many relations, as stated here.

The non-owning side of the relation has the mappedBy attribute. In my sample test case, that means that in the ExpenseEntry.supportingFiles relation, the ExpenseEntry is the owning side. And per:

Would that not mean that the supportingFiles collection are a part of the persistent state of the ExpenseEntry entity?

Thank you!

A bit of extra information: I was hoping that running our sanity checks in a PostInsertEventListener and by throwing an exception, would be a possible alternative. It turns out, however, that even at PostInsert the entity in the event payload is incomplete and missing the TO_PARENT relations — I added some test cases in the reproduction repository.

Would there be any event handler we can hook into, that sees the complete entity and that could be used to rollback the transaction?

This is not surprising, the PostInsertEventListener is invoked in the same phase of the merge, i.e. when the actual entity instance is persisted, so the actual entity state will not yet contain the collections.

Would there be any event handler we can hook into

There are no other event handlers that you can leverage to achieve what you’re trying to do; as I said before, I believe you shouldn’t rely on entity listeners or lifecycle methods if you need to apply custom business logic when actually persisting entities, you should simply avoid calling merge in the first place.

As a small note, as you’re already manually determining wheter you need to update or to insert a new entity instance, you could try using persist instead of merge and see if your existing approach works with that.

I’m having some difficulty conciliating the view that we should not have custom logic in event listeners with what the API says it does and can be used for. The examples show event listeners being used to implement custom security checks to ensure that a user should be authorized do perform a certain change. That is conceptually not much different than what we have been doing: checks regarding what state the entity should be in in order for a change to go through or denied.

If in some cases the event system does not guarantee the entity to be correctly filled, I would respectably submit that this severely reduces the usefulness of the event system. A user of the Hibernate Event API would need to know what combinations of events types and relation types are guaranteed to be filled/not filled when a JPA callback is invoked.

Finally, I’m still unsure about the statement that the supportingFiles collection is not a part of the persistent state of the ExpenseEntry entity. The expense entry in this case is the owning side of the bi-di many-to-many relation (or uni many-to-many relation — both cases fail similarly). Why would it not be filled, under the rule invoked from the JPA spec section 3.2.4?

Thank you for your time with this!