MassIndexer SocketTimeout on index purge with 25'000'000 entities

I am writing to you because the MassIndexer throws a SocketTimeoutException. The first time the index is filled, the exception does not occur. On the second mass indexing run, the index is purged first, and while the purge is in progress a SocketTimeoutException is thrown. The mass indexing itself then proceeds.

Some facts about the setup:

  • Elasticsearch Engine
  • Around 25’000’000 entities are indexed.
  • First mass indexing - no problem
  • Second mass indexing - SocketTimeoutException on operation:
    POST /application-a03-meldung-write/_delete_by_query with parameters {conflicts=proceed}

Elasticsearch needs its time to clean up the index, and the MassIndexer runs into a SocketTimeout for that duration. I therefore tried clearing the index manually before running the MassIndexer, but that did not really help.
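For reference, the manual clearing attempt looked roughly like the following (a sketch using Hibernate Search 6's SearchWorkspace API; it issues the same delete-by-query under the hood, so it ran into the same timeout):

```java
// Sketch of the manual purge attempt via Hibernate Search 6's
// SearchWorkspace API. Under the hood this sends the same
// _delete_by_query request, so it hits the same 30 s read timeout.
SearchSession searchSession = Search.session( entityManager );
SearchWorkspace workspace = searchSession.workspace( Meldung.class, Altdaten.class );
workspace.purge(); // synchronous; blocks until the delete-by-query completes
```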

Are you familiar with this phenomenon? Do you have a tip on how to prevent the SocketTimeoutException?

Stacktrace

12:41:58.274 [xxxMassIndexServiceImpl] ERROR c.b.a.c.c.b.s.i.AbstractJobServiceImpl - Batch gibt auf
org.hibernate.search.util.common.SearchException: HSEARCH700042: 1 failure(s) occurred during mass indexing. See the logs for details. First failure: HSEARCH400007: Elasticsearch request failed: 30’000 milliseconds timeout on connection http-outgoing-16 [ACTIVE]
Request: POST /application-a03-meldung-write/_delete_by_query with parameters {conflicts=proceed}
Response: (no response)
	at org.hibernate.search.mapper.pojo.massindexing.impl.PojoMassIndexingNotifier.reportIndexingCompleted(PojoMassIndexingNotifier.java:146)
	at org.hibernate.search.mapper.pojo.massindexing.impl.PojoMassIndexingBatchCoordinator.notifyFailure(PojoMassIndexingBatchCoordinator.java:254)
	at org.hibernate.search.mapper.pojo.massindexing.impl.PojoMassIndexingFailureHandledRunnable.run(PojoMassIndexingFailureHandledRunnable.java:85)
	at org.hibernate.search.mapper.pojo.massindexing.impl.PojoDefaultMassIndexer.startAndWait(PojoDefaultMassIndexer.java:153)
	at org.hibernate.search.mapper.orm.massindexing.impl.HibernateOrmMassIndexer.startAndWait(HibernateOrmMassIndexer.java:102)
	at ch.xxx.a03.XXX.xxxinfo.business.service.xxxMassIndexServiceImpl.lambda$runBusinessMethod$0(xxxMassIndexServiceImpl.java:41)
	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
	at ch.xxx.a03.XXX.xxxinfo.business.service.xxxMassIndexServiceImpl.runBusinessMethod(xxxMassIndexServiceImpl.java:37)
	at ch.xxx.a03.XXX.common.business.service.impl.AbstractJobServiceImpl$WorkerThread.run(AbstractJobServiceImpl.java:67)
Caused by: org.hibernate.search.util.common.SearchException: HSEARCH400007: Elasticsearch request failed: 30’000 milliseconds timeout on connection http-outgoing-16 [ACTIVE]
Request: POST /application-a03-meldung-write/_delete_by_query with parameters {conflicts=proceed}
Response: (no response)
	at org.hibernate.search.backend.elasticsearch.work.impl.AbstractNonBulkableWork.lambda$execute$2(AbstractNonBulkableWork.java:60)
	at org.hibernate.search.util.common.impl.Futures.lambda$handler$1(Futures.java:63)
	at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990)
	at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
	at org.hibernate.search.backend.elasticsearch.client.impl.ElasticsearchClientImpl$1.onFailure(ElasticsearchClientImpl.java:131)
	at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onDefinitiveFailure(RestClient.java:686)
	at org.elasticsearch.client.RestClient$1.failed(RestClient.java:422)
	at org.apache.http.concurrent.BasicFuture.failed(BasicFuture.java:137)
	at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.executionFailed(DefaultClientExchangeHandlerImpl.java:101)
	at org.apache.http.impl.nio.client.AbstractClientExchangeHandler.failed(AbstractClientExchangeHandler.java:432)
	at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387)
	at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:98)
	at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:40)
	at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:261)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:506)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:211)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.SocketTimeoutException: 30’000 milliseconds timeout on connection http-outgoing-16 [ACTIVE]
	... 11 common frames omitted
13:01:45.066 [catalina-exec-6] ERROR c.b.a.c.c.w.w.f.XXXSoapFaultInterceptor - Der Request konnte wegen einem technischen Fehler in der XXX nicht verarbeitet werden.
org.hibernate.search.util.common.SearchException: HSEARCH400007: Elasticsearch request failed: 30’000 milliseconds timeout on connection http-outgoing-24 [ACTIVE]
Request: POST /application-a03-meldung-read/_search with parameters {size=100, track_total_hits=true}
Response: (no response)
	at org.hibernate.search.backend.elasticsearch.work.impl.AbstractNonBulkableWork.lambda$execute$2(AbstractNonBulkableWork.java:60)
	at org.hibernate.search.util.common.impl.Futures.lambda$handler$1(Futures.java:63)
	at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990)
	at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
	at org.hibernate.search.backend.elasticsearch.client.impl.ElasticsearchClientImpl$1.onFailure(ElasticsearchClientImpl.java:131)
	


12:41:58.272 [XXXMassIndexServiceImpl] INFO  o.h.s.m.p.m.i.PojoMassIndexingLoggingMonitor - HSEARCH000028: Mass indexing complete. Indexed 0 entities.

MassIndexer Config:

MassIndexer indexer = searchSession.massIndexer(Meldung.class, Altdaten.class);
indexer.threadsToLoadObjects(numberOfThreads);
indexer.startAndWait();

Many thanks, Marc-Antoine

Elasticsearch is very slow to delete large numbers of documents, and there’s little we can do to work around that besides raising timeouts to absurd levels. I suppose we could also take advantage of Elasticsearch’s “asynchronous tasks” API (or whatever it’s called) to poll the cluster while waiting for the deletion to finish, but that’s not really a priority, as I’d like to implement better “deletion” strategies eventually (like creating a new index and switching aliases).
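If you do want to go the “raise the timeouts” route, that is done through the Elasticsearch backend configuration properties; a sketch, assuming Hibernate Search 6 property names (the 600 000 ms value is arbitrary, pick what your cluster needs):

```properties
# read_timeout is the 30 000 ms default that the stack trace above is hitting.
hibernate.search.backend.read_timeout = 600000
# request_timeout caps the whole request (connection + response); unset by default.
hibernate.search.backend.request_timeout = 600000
```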

For now, if you’re not using multitenancy, I’d recommend simply dropping and recreating the schema when you reindex: it’s almost instantaneous and functionally the same, unless you made manual changes to your schema (and even then you can have Hibernate Search reproduce changes to your mapping and settings).

There’s a built-in option to do just that; just remember to disable the initial purge, as it becomes pointless (though Search 6.2 should disable it automatically):

MassIndexer indexer = searchSession.massIndexer(Meldung.class, Altdaten.class);
indexer.threadsToLoadObjects(numberOfThreads);
indexer.purgeAllOnStart(false).dropAndCreateSchemaOnStart(true);
indexer.startAndWait();