Lucene index corruption

#1

Hi.
We are using Hibernate Search in our web app (with connection pooling), as well as in separate scripts that run as cron jobs.

We write to the index in three different ways:

  1. in real time, in the web app (with connection pooling), when a user creates or edits an entity, by manually calling
    org.hibernate.search.jpa.Search.getFullTextEntityManager(entityManager).index(fooEntity);

  2. via cron, every 5 minutes, for entities awaiting indexing in our SEARCH_QUEUE_INDEX table;

  3. via cron, every 12 hours, where we rebuild the entire index from scratch, to ensure nothing fell through the cracks.
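
For context, the 12-hour rebuild (method 3) amounts to what Hibernate Search's MassIndexer provides; a minimal sketch (the tuning values are placeholders, not our exact cron code):

```java
import javax.persistence.EntityManager;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;

public class RebuildIndexJob {

    // Rebuilds the whole FooEntity index from the database.
    // purgeAllOnStart (the default) wipes the old index first.
    public static void rebuild(EntityManager entityManager) throws InterruptedException {
        FullTextEntityManager ftem = Search.getFullTextEntityManager(entityManager);
        ftem.createIndexer(FooEntity.class)
            .batchSizeToLoadObjects(25)   // placeholder tuning value
            .threadsToLoadObjects(4)      // placeholder tuning value
            .startAndWait();              // blocks until reindexing completes
    }
}
```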

To ensure two processes don't write to the index at once, we use a DB table called SEARCH_QUEUE_INDEX to manage our write locking, so that only one process writes to the index at a time.
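
Conceptually the guard is a cross-process try-lock. The standalone sketch below illustrates the idea with a file lock (our actual implementation uses the DB table; the class and file names here are just for illustration):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteLockDemo {
    public static void main(String[] args) throws IOException {
        Path lockFile = Path.of("write.lock");
        try (FileChannel ch = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = ch.tryLock();  // non-blocking: null if another process holds it
            if (lock == null) {
                System.out.println("another process holds the lock; skipping this run");
                return;
            }
            try {
                System.out.println("lock acquired; safe to write the index");
                // ... index writes would happen here ...
            } finally {
                lock.release();
            }
        }
    }
}
```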

But it seems this isn't working: our index is becoming corrupted.

We are seeing two types of errors in our logs:

  1. Unable to reopen IndexReader

  2. HSEARCH000058: HSEARCH000117: IOException on the IndexWriter

Here are the stack traces for each:
1)

org.hibernate.search.exception.SearchException: Unable to reopen IndexReader
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider$PerDirectoryLatestReader.refreshAndGet(SharingBufferReaderProvider.java:243)
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider.openIndexReader(SharingBufferReaderProvider.java:74)
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider.openIndexReader(SharingBufferReaderProvider.java:36)
	at org.hibernate.search.reader.impl.ManagedMultiReader.createInstance(ManagedMultiReader.java:70)
	at org.hibernate.search.reader.impl.MultiReaderFactory.openReader(MultiReaderFactory.java:49)
	at org.hibernate.search.query.engine.impl.LuceneHSQuery.buildSearcher(LuceneHSQuery.java:482)
	at org.hibernate.search.query.engine.impl.LuceneHSQuery.queryResultSize(LuceneHSQuery.java:222)
	at org.hibernate.search.query.hibernate.impl.FullTextQueryImpl.doGetResultSize(FullTextQueryImpl.java:272)
	at org.hibernate.search.query.hibernate.impl.FullTextQueryImpl.getResultSize(FullTextQueryImpl.java:263)
	at foo.Search.executeSearch(Search.java:248)
Caused by: org.apache.lucene.index.CorruptIndexException: file mismatch, expected id=3vrtzq0x94ltu15kzym458w8p, got=1rows74yewkm4pwz102cqzr70 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/path/to/lucene_index/FooEntity/_3j7.si")))
	at org.apache.lucene.codecs.CodecUtil.checkIndexHeaderID(CodecUtil.java:266)
	at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:256)
	at org.apache.lucene.codecs.lucene50.Lucene50SegmentInfoFormat.read(Lucene50SegmentInfoFormat.java:86)
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:362)
	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:493)
	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:490)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683)
	at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:490)
	at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:344)
	at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:300)
	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:263)
	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251)
	at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:137)
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider$PerDirectoryLatestReader.refreshAndGet(SharingBufferReaderProvider.java:240)
	... 61 more
	Suppressed: org.apache.lucene.index.CorruptIndexException: checksum passed (9573fec2). possibly transient resource issue, or a Lucene or JVM bug (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/path/to/lucene_index/FooEntity/_3j7.si")))
		at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:379)
		at org.apache.lucene.codecs.lucene50.Lucene50SegmentInfoFormat.read(Lucene50SegmentInfoFormat.java:117)
		... 73 more

2)

org.hibernate.search.exception.impl.LogErrorHandler - HSEARCH000058: HSEARCH000117: IOException on the IndexWriter
java.nio.file.NoSuchFileException: /path/to/lucene_index/FooEntity/_1s6.cfe
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
	at java.nio.channels.FileChannel.open(FileChannel.java:287)
	at java.nio.channels.FileChannel.open(FileChannel.java:335)
	at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:237)
	at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109)
	at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.readEntries(Lucene50CompoundReader.java:105)
	at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.<init>(Lucene50CompoundReader.java:69)
	at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.getCompoundReader(Lucene50CompoundFormat.java:71)
	at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:93)
	at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:65)
	at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
	at org.apache.lucene.index.BufferedUpdatesStream$SegmentState.<init>(BufferedUpdatesStream.java:390)
	at org.apache.lucene.index.BufferedUpdatesStream.openSegmentStates(BufferedUpdatesStream.java:422)
	at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:267)
	at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3172)
	at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3158)
	at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2814)
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2970)
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2935)
	at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.commitIndexWriter(IndexWriterHolder.java:150)
	at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.commitIndexWriter(IndexWriterHolder.java:163)
	at org.hibernate.search.backend.impl.lucene.PerChangeSetCommitPolicy.onChangeSetApplied(PerChangeSetCommitPolicy.java:29)
	at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.afterTransactionApplied(AbstractWorkspaceImpl.java:98)
	at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:108)
	at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:47)
	at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.applyChangesets(SyncWorkProcessor.java:167)
	at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.run(SyncWorkProcessor.java:153)
	at java.lang.Thread.run(Thread.java:745)

The relevant config, shared by the three indexing methods above, is:

hibernate.search.indexing_strategy = manual
hibernate.search.default.locking_strategy = none
hibernate.search.exclusive_index_use = true
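
For comparison, my understanding of the Hibernate Search 5 defaults we're overriding (worth double-checking against the reference docs; note the `default.` scope on the last property):

```
hibernate.search.indexing_strategy = event
hibernate.search.default.locking_strategy = native
hibernate.search.default.exclusive_index_use = true
```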

Any ideas what could be going on here?

Is the index getting corrupted because two processes are accidentally writing to the same index (with the subsequent read errors a result of that corruption)?

#2

Would setting hibernate.search.exclusive_index_use = false fix this problem?

#3

It could, but it would also likely decrease performance, perhaps to the point of being unusable.

Why are you writing to the same index from different processes (I assume different instances of Hibernate Search)? This whole setup could work fine with a single Hibernate Search instance.
If the concern is the connection pool, pools generally have options to configure a minimum and maximum number of connections, so you wouldn't have to hold many connections open all the time, only when mass indexing.
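
For example, with HikariCP (the property names below come from Hibernate's hibernate-hikaricp integration; adjust to whichever pool you use):

```
# keep only a couple of idle connections normally...
hibernate.hikari.minimumIdle=2
# ...but allow bursts (e.g. during mass indexing) up to 20
hibernate.hikari.maximumPoolSize=20
```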
