MassIndexer delete data based on condition

meneghette · February 8, 2024, 7:24pm

Hello, I’m trying to execute the MassIndexer with purgeAllOnStart = false and with a condition.

Example

massIndexer.type( WorkflowDefinition.class ).reindexOnly( "type = :type" ).param( "type", "SUB" );

When I do that, it reindexes all the entities matching this condition, but it does not delete from the index the entities that no longer exist in the database.

For example, we have 8 records; if I delete 1 record directly in the database and then reindex, the deleted entity stays in Elasticsearch.

If I set purgeAllOnStart = true, it deletes everything, but since we have a lot of records, reindexing takes a long time.

Is it possible, with purgeAllOnStart = false, to first delete the documents matching the condition from the index and then reindex the data?

yrodiere · February 9, 2024, 8:04am

Hello,

This is how it’s expected to work; see the Hibernate Search 7.0.0.Final reference documentation:

Even if the reindexing is applied on a subset of entities, by default all entities will be purged at the start. The purge can be disabled completely, but when enabled there is no way to filter the entities that will be purged.

What you’d need is either HSEARCH-3304 or HSEARCH-1032.

Unless you want to contribute one of these, and until someone else does, you can use the following workarounds:

Avoid letting your index get out of sync in the first place; that could be possible if the cause is just some JPA batch job.
Delete entities from the index manually; that’s probably only realistic if there is a small to medium number of entities.
On Elasticsearch/OpenSearch only, retrieve the REST client and send a delete_by_query request to Elasticsearch/OpenSearch.
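The last workaround could look roughly like the sketch below. It only builds the endpoint path and JSON body of a delete_by_query request using plain Java; the index alias `workflowdefinition-write` and the assumption that the entity's `type` property is indexed as a keyword field named `type` are hypothetical and need to be checked against the actual mapping. The comments at the end outline how the request might be sent through the REST client retrieved from Hibernate Search.

```java
import java.util.Objects;

/** Builds the pieces of an Elasticsearch/OpenSearch delete_by_query request. */
final class DeleteByQuerySketch {

    private DeleteByQuerySketch() {
    }

    /** Endpoint path for delete_by_query on the given index or alias. */
    static String endpoint(String indexName) {
        Objects.requireNonNull(indexName, "indexName");
        return "/" + indexName + "/_delete_by_query";
    }

    /** JSON body matching documents whose field holds exactly the given value. */
    static String termQueryBody(String field, String value) {
        return "{\"query\":{\"term\":{\"" + escape(field) + "\":\"" + escape(value) + "\"}}}";
    }

    /** Escapes backslashes, double quotes and control characters for a JSON string. */
    private static String escape(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed names: adjust the alias and the field to the real index mapping.
        String path = endpoint("workflowdefinition-write");
        String body = termQueryBody("type", "SUB");
        System.out.println("POST " + path);
        System.out.println(body);

        // Sending the request (needs Hibernate Search and the Elasticsearch
        // low-level REST client on the classpath, so it is left as a comment):
        //   RestClient client = Search.mapping(entityManagerFactory).backend()
        //           .unwrap(ElasticsearchBackend.class).client(RestClient.class);
        //   Request request = new Request("POST", path);
        //   request.setJsonEntity(body);
        //   client.performRequest(request);
    }
}
```

After such a request, running the MassIndexer with purgeAllOnStart = false and the same reindexOnly condition should add back only the entities that still exist in the database. Note that the matching documents are missing from search results between the two steps.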

meneghette · February 9, 2024, 10:40am

I would like to contribute. Where can I find the steps to do that?

yrodiere · February 9, 2024, 10:43am

Thanks!

All necessary information should be here: hibernate-search/CONTRIBUTING.md at main · hibernate/hibernate-search · GitHub

Feel free to reach out to the team if you need help. In particular, it might be a good idea to discuss your idea for any new APIs on Zulip before spending too much time on it; that could spare you some wasted effort.

yrodiere · February 9, 2024, 10:51am

I updated the description of HSEARCH-3304.

I’d recommend starting with that one, as it will most likely be necessary to implement HSEARCH-1032, which could also turn out to be much more complex.
