…and so a `LazyInitializationException` can be avoided, as alluded to in the HS6 Migration Guide, right? This is what we were referring to. This is one function of our interceptor, which did for HS5 what the buffering now does for HS6.
I only mentioned this because the `flush(); clear();` use case was still possible for us to implement in HS5; it just took a little more work to accomplish, and that work is contained in our interceptor. The fact that your implementation had the side effect of being a breaking change for other use cases is unrelated to our interceptor, which will be irrelevant from the perspective of HS6. (It will still be useful for our serialization process.) No other connection to the issues we’re having was intended to be stated or implied, and none of our interceptor logic is found in the test cases.
This is the main problem that we both face. It’s a troublesome issue, one that we should mention to the Hibernate team somehow.
It has not posed much of a problem for us until now, as we simply used `refresh()` to revitalize any objects returned in this half-baked state. `flush()` is required prior to `refresh()` so that new property values that were just merged do not get overwritten; otherwise we probably would not call `flush()` much.
However, I believe that `flush()` can be triggered in other ways as well, such as by running certain HQL queries.
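For instance, a minimal sketch (the `Account` entity and its `status` field are made up for illustration): under the default `FlushMode.AUTO`, Hibernate flushes pending changes before an HQL query that touches the same table.

```java
import java.util.List;
import org.hibernate.Session;

// Minimal sketch; "Account" and its "status" field are hypothetical.
static List<Account> closeAndList(Session session) {
    Account account = session.get(Account.class, 42L);
    account.setStatus("CLOSED"); // pending change, not yet flushed

    // Under the default FlushMode.AUTO, Hibernate flushes the pending
    // UPDATE before executing this HQL, so the query sees current state.
    return session
            .createQuery("from Account a where a.status = 'CLOSED'", Account.class)
            .getResultList();
}
```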
So yes, this is how we have been compensating for the way ORM returns objects that get reattached to the `Session`/`EntityManager`. We currently know of no better way to completely reattach a detached object to the current `Session`, as if it had been read from it initially, than with two lines of code, `flush(); refresh();`, and until now we have not had a reason to go looking for a better one.
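Concretely, the idiom looks something like this (a minimal sketch; `Order` is a placeholder entity, and `javax.persistence` would be `jakarta.persistence` on newer ORM releases):

```java
import javax.persistence.EntityManager;

// Sketch of the reattachment idiom described above. "detached" arrived
// from a client over the wire and is no longer associated with a Session.
static Order reattach(EntityManager em, Order detached) {
    Order managed = em.merge(detached); // copy client state onto a managed instance
    em.flush();          // push the merged values to the database first...
    em.refresh(managed); // ...so refresh() does not overwrite them, and
                         // associations come back as "armed" lazy proxies
    return managed;
}
```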
I am not aware whether ORM exposes an API which can turn an association into a `HibernateProxy` on demand. If it does, we could loop through all of our association properties after a `persist` or `merge` of a detached object and do this on our own, although it would be best if ORM did this internally. We would also be able to avoid an early `flush()`, because our call to `refresh()` would no longer be necessary. More on this below…
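For what it’s worth, plain JPA does expose `EntityManager.getReference()`, which returns an uninitialized proxy by identifier. A hypothetical sketch of the loop described above might look like the following (`Order`, `Customer`, and the helper are invented for illustration, not an existing API):

```java
import javax.persistence.EntityManager;

// Hypothetical sketch: swap a concrete association for an uninitialized
// proxy after merge/persist, so it can be lazy-loaded on demand later.
static void replaceCustomerWithProxy(EntityManager em, Order order) {
    Customer customer = order.getCustomer();
    if (customer != null && customer.getId() != null) {
        // getReference() issues no SELECT; the proxy loads itself only
        // when a non-identifier property is first accessed.
        order.setCustomer(em.getReference(Customer.class, customer.getId()));
    }
}
```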
That’s correct, because ORM has no need for us to do so. When the other side can be lazy loaded on demand (and could be huge), there has been no need to hard-code it. The `@OneToMany` with `mappedBy` relationship is just a convenience for automatically issuing a left-join query to obtain a collection on demand; it isn’t a concept found in the database schema. When creating or merging, the other side is commonly going to be huge either way, and generally should not be initialized (unless it is indexed) just to add one more record to a list that will be found anyway by lazy loading.
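For example (placeholder entities), only the `@ManyToOne` side maps a foreign-key column; the `mappedBy` collection adds nothing to the schema and is simply queried on demand:

```java
import javax.persistence.*;
import java.util.List;

@Entity
@Table(name = "customers")
class Customer {
    @Id @GeneratedValue
    private Long id;

    // No column of its own: "mappedBy" means "run a join query against
    // orders.customer_id on demand". Stays uninitialized until traversed.
    @OneToMany(mappedBy = "customer", fetch = FetchType.LAZY)
    private List<Order> orders;
}

@Entity
@Table(name = "orders") // avoids the reserved word ORDER
class Order {
    @Id @GeneratedValue
    private Long id;

    // This side owns the relationship and maps the customer_id FK column.
    @ManyToOne(fetch = FetchType.LAZY)
    private Customer customer;
}
```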
Absolutely, and this is no issue for ORM either. Almost all `@ManyToOne` properties are set on objects by the client side (Windows, macOS, iPhone, etc.) and sent back to the server to persist or merge. We rarely create entity objects in Java, except by deserialization off the wire.
These incoming objects were once loaded by a `Session`, were serialized and sent to the client, were displayed by the client UI, and were then sent back to the server, at which point they are, of course, detached. The fastest and most efficient thing to do on the server side at this point is to validate, persist or merge, and then refresh to obtain “armed” proxies for loading associations on demand, so that the client can receive the latest hierarchical tree.
Trying to manage “the other side” for every `@ManyToOne` association would be a performance disaster: initializing a huge collection just to add one more entity to it. Unless, of course, the collection is indexed, in which case HS can simply traverse the relationship, which will do the loading for us without hard-coding. Most collections are lazy and are not sent to the client.
This would be excellent.
I hope I have always been clear and constructive: yes, this is one way to describe the situation. You would think, however, that after a certain use case has been supported for more than 12 years, it might be eligible to be promoted from “workaround” status to a bona fide use case, no?
Hindsight is always 20/20.
For our use case, Hibernate Search 6 was not backwards compatible with previous releases, and you are right: it has never been able to cope with these behaviors, behaviors that Hibernate really has, and has had for all 17 years we’ve been using it, and yet no unit test revealed the issue. We just found ways to cope and moved on without thinking much of it. The facilities and timings were there to make everything right in the end, and that is what mattered most.
We do hope that more unit tests will be added that exercise Hibernate with our use case, with objects that have been reattached to the `Session`, so we will be protected in future releases.
Working with detached but still in-sync entity data is a very common pattern in n-tier applications. All of our entities have `@Version` columns, which ensure the client had the most up-to-date copy at the time we try to merge.
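For reference, the mapping is just the following (sketch; `Account` is a placeholder entity):

```java
import javax.persistence.*;

@Entity
class Account {
    @Id @GeneratedValue
    private Long id;

    // Incremented by Hibernate on every UPDATE; merging a stale detached
    // copy then fails with an OptimisticLockException instead of silently
    // overwriting newer data.
    @Version
    private long version;
}
```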
We agree: solution #2 sounds better than solution #1. Your choice is fine, whether it is a config property (`persistence.xml`) or an API to register an event listener.
First, I would like to take a look at the APIs that Hibernate uses to proxy-ize objects, to see whether I can use them. If I can, I could pass all objects coming in from clients through a routine that replaces all references with proxies, and you might not have to make any changes at all.
We only use `flush()` as a way to protect our merged data from the next `refresh()`, and we only use `refresh()` so we can obtain proxies for lazy loading. We don’t care that much about `flush()` by itself.
There was a time when we needed to use `refresh()` after `persist()` to obtain the database-assigned `@Id` property value, so we could move forward with using the new object as a reference, rather than seeing a Hibernate exception because the `@Id` was still zero, which makes the entity look unsaved.
I need to do some testing to see whether this is still necessary in the latest ORM release with the JDBC drivers we use today. We use identity columns at the database level, and there was a time when the database dialect did not know how to populate the `@Id` after an insert. We could also write a new dialect if needed.
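For what it’s worth, with `GenerationType.IDENTITY` current ORM releases should read the generated key back during `persist()` itself (via JDBC `getGeneratedKeys()` on most dialects), which would make the extra `refresh()` for the `@Id` unnecessary. A sketch, with `Account` again a placeholder:

```java
import javax.persistence.*;

@Entity
class Account {
    // With IDENTITY generation the INSERT happens eagerly at persist()
    // time and Hibernate populates the database-assigned id immediately,
    // so no refresh() is needed just to see the key.
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
}
```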
Thank you for all the help, @yrodiere! Let’s keep brainstorming the best short-term approach, and then begin the process of achieving the long-term goals as well. I’ll be working on this until we arrive at a good stopping point.