I sometimes observe strange behavior (an inconsistency) when I make fast changes (within milliseconds) to the same record in the database.
I create a record in the database, which results in this OpenSearch call:
curl -iX POST 'https://xxxx.es.amazonaws.com/_bulk' -d '{"index":{"_index":"xxx-write","_id":"T1_1","routing":"T1"}}
{"data":"test1", "_entity_type":"Xxxx","_tenant_id":"T1","_tenant_doc_id":"1"}
'
Response is:
HTTP/1.1 200 OK
{"took":4,"errors":false,"items":[{"index":{"_index":"xxx-000001","_id":"T1_1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":26463,"_primary_term":1,"status":201}}]}
Then I update the same record in the database, changing the value to “test2”:
curl -iX POST 'https://xxxx.es.amazonaws.com/_bulk' -d '{"index":{"_index":"xxx-write","_id":"T1_1","routing":"T1"}}
{"data":"test2", "_entity_type":"Xxxx","_tenant_id":"T1","_tenant_doc_id":"1"}
'
Response is:
HTTP/1.1 200 OK
{"took":5,"errors":false,"items":[{"index":{"_index":"xxx-000001","_id":"T1_1","_version":3,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":26515,"_primary_term":1,"status":200}}]}
Then I update the same record in the database again, changing the value to “test3”:
curl -iX POST 'https://xxxx.es.amazonaws.com/_bulk' -d '{"index":{"_index":"xxx-write","_id":"T1_1","routing":"T1"}}
{"data":"test3", "_entity_type":"Xxxx","_tenant_id":"T1","_tenant_doc_id":"1"}
'
Response is:
HTTP/1.1 200 OK
{"took":6,"errors":false,"items":[{"index":{"_index":"xxx-000001","_id":"T1_1","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":26506,"_primary_term":1,"status":200}}]}
The database holds the correct value, “test3”.
But OpenSearch holds “test2” instead of “test3”.
For some reason, OpenSearch processed the third request (“test3”) before the second one (“test2”).
This can be seen from the _version and _seq_no values in the OpenSearch responses: the “test3” request got _version 2 and _seq_no 26506, while the “test2” request got _version 3 and _seq_no 26515.
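To make that reading of the responses explicit, here is a minimal Java sketch that picks the write OpenSearch applied last by comparing _seq_no values. The BulkItemResult record and the SeqNoOrder class are hypothetical names made up for this illustration; the numbers are copied from the three responses above. It assumes all three writes hit the same shard (same _id and routing), since _seq_no is only comparable within one shard.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for the fields of interest from one _bulk response item.
record BulkItemResult(String data, long version, long seqNo) {}

class SeqNoOrder {
    // Returns the value OpenSearch applied last, i.e. the one with the highest _seq_no.
    static String lastApplied(List<BulkItemResult> results) {
        return results.stream()
                .max(Comparator.comparingLong(BulkItemResult::seqNo))
                .map(BulkItemResult::data)
                .orElseThrow();
    }

    public static void main(String[] args) {
        // Listed in the order the requests were issued, not the order they were applied.
        List<BulkItemResult> results = List.of(
                new BulkItemResult("test1", 1, 26463),
                new BulkItemResult("test2", 3, 26515),
                new BulkItemResult("test3", 2, 26506));
        System.out.println(lastApplied(results)); // prints "test2"
    }
}
```

The highest _seq_no belongs to the “test2” write, which matches what ends up in the index.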
I am using "hibernate.search.indexing.plan.synchronization.strategy" = "write-sync".
"hibernate.core.version" = 6.6.1.Final
"hibernate.search" = 7.2.1.Final
This happens occasionally.
Do you know what causes this issue and how to fix it?
I see this in the responses: "total":2,"successful":2. Do you have multiple OpenSearch nodes, or just two shards (e.g. primary/replica) on the same node? What’s your OpenSearch setup exactly?
It’s a single application running on multiple nodes, so the call to the database can come from any node.
Each change is always a single transaction, and it can come from any node. So one update can come from node1 and the next from node2, but they are always independent single transactions, and they result in correct values in the database. I have constraints on the table and a trigger that rolls back the transaction if the data is wrong (wrong version).
The logs show 3 separate _bulk requests.
I am not setting “hibernate.search.coordination.strategy” at all, so there is no coordination.
OpenSearch 2.11 - OpenSearch_2_11_R20241003, 3-AZ without standby, 3 data nodes. Index split into 64 shards.
OK, well, those are not Hibernate Search logs. No idea where they come from.
That’s your problem right there. If your second and third operations happen on different nodes, it’s totally possible that the indexing requests are sent to Elasticsearch out of order, because of e.g. garbage collector pauses or just thread scheduling. Or they are sent in order, but due to network latency variations, are received by Elasticsearch out of order.
With a single application node it’s not usually a problem, because there are in-JVM mechanisms to preserve operation order (ES requests are executed in the same order as the order of transaction completion). With multiple nodes, there’s no such protection.
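As a rough illustration of why a single JVM can preserve order, here is a minimal Java sketch that funnels "indexing requests" through one worker thread, so they are applied in submission order. This is not Hibernate Search's actual implementation; the OrderedIndexingSketch class and its method are hypothetical, and the "request" is just a string appended to a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderedIndexingSketch {
    // Submits each value to a single worker thread and returns the order
    // in which the worker applied them.
    static List<String> applyInOrder(List<String> values) {
        List<String> applied = new ArrayList<>();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        for (String value : values) {
            // One worker thread means tasks run one at a time, in submission order.
            worker.execute(() -> applied.add(value));
        }
        worker.shutdown();
        try {
            if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return applied;
    }

    public static void main(String[] args) {
        System.out.println(applyInOrder(List.of("test1", "test2", "test3")));
        // prints "[test1, test2, test3]"
    }
}
```

With several application nodes, each node has its own queue like this one, and nothing orders requests across the queues, which is where the reordering can creep in.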
In your particular case optimistic concurrency control could, maybe, have helped, though picking a suitable version number could be challenging. But anyway, nobody cared enough so far to contribute an implementation of that, so you can’t use it right now.
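To show what optimistic concurrency control would buy here, the following Java sketch simulates external versioning in memory: a write is applied only if its version is strictly greater than the stored one, so a late-arriving older write is rejected. OpenSearch's index API offers comparable semantics through an external version, but as said above Hibernate Search does not expose it, so this is only a simulation. The ExternalVersionStore class is hypothetical, and using the database row version as the version number is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for an index that enforces external versioning.
class ExternalVersionStore {
    private record Versioned(long version, String data) {}

    private final Map<String, Versioned> docs = new HashMap<>();

    // Applies the write only if its version is strictly greater than the stored one.
    // Returns false when the write is stale and was rejected.
    boolean index(String id, long version, String data) {
        Versioned current = docs.get(id);
        if (current != null && version <= current.version()) {
            return false;
        }
        docs.put(id, new Versioned(version, data));
        return true;
    }

    String get(String id) {
        Versioned current = docs.get(id);
        return current == null ? null : current.data();
    }

    public static void main(String[] args) {
        ExternalVersionStore store = new ExternalVersionStore();
        // Versions are assumed to come from the database row version (1, 2, 3).
        store.index("T1_1", 1, "test1");
        store.index("T1_1", 3, "test3");                    // arrives early
        boolean applied = store.index("T1_1", 2, "test2");  // arrives late, stale
        System.out.println(applied);            // prints "false"
        System.out.println(store.get("T1_1"));  // prints "test3"
    }
}
```

The hard part in practice is the one already mentioned: finding a version number that increases monotonically for every change that affects the indexed document.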
I am not sure I can use outbox-polling coordination, because it needs the list of tenants up front in the configuration, and in my solution new tenants are added dynamically. Also, each tenant has its own database.
This should not be a problem, as long as it works in Hibernate ORM.
This will be a problem indeed. Hibernate Search needs to start an agent for each tenant – especially if each has its own database. Without knowing the list of tenants, it can’t start agents and can’t process events.
Maybe there could be a new feature in a future version of Hibernate Search where applications can list tenants for Hibernate Search on startup (e.g. they retrieve them from a DB), and later “notify” Hibernate Search about newly added tenants, so that Hibernate Search starts agents accordingly. But so far nobody has requested that feature or offered to work on it.