Concurrent Updates and Index Update Questions

simi · April 13, 2020, 4:29pm

I’m currently running Hibernate Search 5.10.6.Final against an Elasticsearch 6.2 instance, and calls from our client have been leaving the index out of sync in some interesting ways. I’m wondering if this is a known issue.

I have 2 instances of the application running, and the client is making multiple updates to the same entity from different threads. There are no optimistic concurrency exceptions in the log (the database is correct) but the index seems to be out of sync. A re-index “fixes” this. Is it possible (or likely) that the concurrent updates to the same entity might cause this - like the index update overwrites a change it doesn’t yet know about?

I’m having the owner of the client attempt to serialize these calls to see if that truly is the culprit, but in the meantime I was wondering whether this is a known issue.

Thanks for any insight!

Chris

yrodiere · April 14, 2020, 6:29am

It is indeed a known problem. Due to how events are sourced and how data is sourced in Hibernate Search, this problem can occur if two transactions in two separate instances of the application require reindexing of the same entity at the same time: in this case, it’s possible that each instance of the application applies changes to a different part of the indexed document, and the last update will win.

Note that this problem will only occur if two transactions occur in parallel and trigger the reindexing of the same entity.
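To make the "last update wins" scenario concrete, here is a minimal, deterministic sketch in plain Java. It is not Hibernate Search code: the maps standing in for the database row and the indexed document, and the class name `LostIndexUpdateDemo`, are assumptions made purely for illustration. It only models the idea that each instance builds the whole document from what its own transaction could see.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the race described above (hypothetical names, not Hibernate Search APIs).
public class LostIndexUpdateDemo {

    // Stand-in for the database row: the source of truth.
    static final Map<String, String> database = new HashMap<>();
    // Stand-in for the Elasticsearch document built from that row.
    static Map<String, String> indexedDocument = new HashMap<>();

    static Map<String, String> run() {
        database.clear();
        database.put("title", "old title");
        database.put("author", "old author");

        // Both transactions start before either one commits,
        // so each application instance sees the old row.
        Map<String, String> seenByA = new HashMap<>(database);
        Map<String, String> seenByB = new HashMap<>(database);

        // Instance A changes the title, instance B changes the author.
        seenByA.put("title", "new title");
        seenByB.put("author", "new author");

        // The database accepts both changes: they touch different columns.
        database.put("title", "new title");
        database.put("author", "new author");

        // Each instance writes a full document built from its own view.
        indexedDocument = new HashMap<>(seenByA);
        indexedDocument = new HashMap<>(seenByB); // last write wins, A's title change is lost
        return indexedDocument;
    }

    public static void main(String[] args) {
        Map<String, String> doc = run();
        System.out.println("database: " + database);
        System.out.println("index:    " + doc);
        System.out.println("in sync:  " + doc.equals(database));
    }
}
```

In this model the database ends up with both changes while the index keeps the old title, which matches the symptom of a correct database, no concurrency exceptions, and an index that only a reindex repairs.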

Unfortunately, that kind of problem cannot be solved with Elasticsearch’s optimistic concurrency control (_version). There are plans to solve this in Hibernate Search 6 by moving the part of reindexing where we extract data from entities to a separate, out-of-transaction process, so that reindexing is effectively serialized for a given entity. But that will take time, as we have other matters to address in Search 6 first.

In the meantime, yes, your options are:

- to reindex periodically to fix the few problems that will invariably occur over time. That can be done with zero downtime in Hibernate Search 6 with Elasticsearch, with a bit of work on your side.
- or to serialize the changes affecting that problematic entity yourself, so that the problems never occur in the first place.
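As a rough illustration of the second option, the sketch below serializes updates per entity id with one lock per id. The class `PerEntitySerializer` and its methods are hypothetical names, not part of Hibernate Search, and the sketch only covers threads inside a single JVM. With two application instances, as in this thread, the serialization would have to happen in the client or through some lock shared by both instances.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: runs updates to the same entity id one at a time (single JVM only).
public class PerEntitySerializer {

    // One lock per entity id; entries are never evicted in this sketch.
    private final ConcurrentHashMap<Object, ReentrantLock> locks = new ConcurrentHashMap<>();

    public void runSerialized(Object entityId, Runnable update) {
        ReentrantLock lock = locks.computeIfAbsent(entityId, id -> new ReentrantLock());
        lock.lock();
        try {
            // In a real application this would be the whole transaction,
            // including the commit that triggers reindexing.
            update.run();
        } finally {
            lock.unlock();
        }
    }

    // Small demonstration: 1000 unsynchronized increments stay consistent under the lock.
    static int demo() {
        PerEntitySerializer serializer = new PerEntitySerializer();
        int[] counter = {0};
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 1000; i++) {
            pool.submit(() -> serializer.runSerialized(42L, () -> counter[0]++));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Read the result under the same lock so the value is safely visible.
        int[] result = new int[1];
        serializer.runSerialized(42L, () -> result[0] = counter[0]);
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("updates applied: " + demo());
    }
}
```

Updates to different entity ids still run in parallel, since each id gets its own lock; only updates to the same entity wait for each other.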

simi · April 14, 2020, 11:38am

Thank you for the quick response. As per your comments - the client has successfully reproduced and fixed the issue by moving from parallel to serial access when performing updates. Interesting news on version 6 - thanks for the heads up.

Chris
