We have very large indexed transaction tables that embed indexed properties of reference tables, and the indexes for some of the most frequently used reference tables in turn contain embedded properties that change continuously. There is very little overlap between the embedded properties of the reference tables that change frequently and the properties that are embedded in the transaction tables, which do not change.
In Hibernate Search 5, there appears to be no optimization for, or detection of, this use case. As a result, a change to an index embedded in a reference table spirals into a huge and unnecessary amount of automatic reindexing of the transaction tables and their respective embedded indexes, none of which produces any useful change. It becomes painfully slow, to the point that we have to limit our use of Hibernate Search in order to get a modicum of productive performance.
Hibernate Search 5 seems to assume that a change to any indexed property must trigger reindexing through every @ContainedIn association, to unlimited depth up the chain. It does this even though the developer can filter properties via includePaths on the other side, so that the change which triggered the refresh of the embedded index may actually have no effect on a given container.
I am hoping Hibernate Search 6 will begin using includePaths of @IndexedEmbedded intelligently, to avoid traversing an association during automatic reindexing, if there will be no useful change made. I would like to confirm this has been implemented in the next release.
What I am looking for could be described as a @ContainedIn annotation at the property level, with a list of the inverse associations to which it applies, instead of a global @ContainedIn on the inverse association that is applied to every property (included or not).
The same could be accomplished by using the includePaths of @IndexedEmbedded on the forward association to identify when we can safely stop traversing the hierarchical tree. I understand this is a complex dependency calculation, but the performance gained by implementing this will far outweigh the development effort.
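To make the idea concrete, here is a minimal sketch in plain Java of the kind of pruning I have in mind. It does not use any Hibernate Search API: the `Embedding` class, the `containersToReindex` method, the `MAX_DEPTH` guard, and the `Transaction`/`Account`/`Balance` entities are all hypothetical names invented for illustration. It assumes that an empty includePaths set means "everything is embedded", and that paths are expressed relative to the embedded entity.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ReindexPruningSketch {

    /** Hypothetical model of one @IndexedEmbedded association (not a Hibernate Search class). */
    static final class Embedding {
        final String container;          // entity type that declares the @IndexedEmbedded field
        final String field;              // name of the association field in the container
        final String embedded;           // entity type that is embedded
        final Set<String> includePaths;  // paths relative to the embedded entity; empty = all

        Embedding(String container, String field, String embedded, Set<String> includePaths) {
            this.container = container;
            this.field = field;
            this.embedded = embedded;
            this.includePaths = includePaths;
        }
    }

    /** Assumed safety limit so that cyclic mappings cannot recurse forever. */
    static final int MAX_DEPTH = 10;

    /** Returns the container types whose index actually depends on the changed property. */
    static Set<String> containersToReindex(List<Embedding> mapping, String changedType, String changedPath) {
        Set<String> result = new LinkedHashSet<>();
        collect(mapping, changedType, changedPath, 0, result);
        return result;
    }

    private static void collect(List<Embedding> mapping, String type, String path, int depth, Set<String> result) {
        if (depth >= MAX_DEPTH) {
            return;
        }
        for (Embedding e : mapping) {
            if (!e.embedded.equals(type)) {
                continue; // this association does not embed the changed type
            }
            boolean included = e.includePaths.isEmpty() || e.includePaths.contains(path);
            if (!included) {
                continue; // the container filters this path out: stop traversing here
            }
            result.add(e.container);
            // Seen from the container, the changed path is prefixed with the association field name.
            collect(mapping, e.container, e.field + "." + path, depth + 1, result);
        }
    }

    /** Example mapping: Transaction embeds only Account.number; Account embeds Balance.amount. */
    static List<Embedding> exampleMapping() {
        List<Embedding> mapping = new ArrayList<>();
        mapping.add(new Embedding("Account", "balance", "Balance", Set.of("amount")));
        mapping.add(new Embedding("Transaction", "account", "Account", Set.of("number")));
        return mapping;
    }

    public static void main(String[] args) {
        List<Embedding> mapping = exampleMapping();
        System.out.println(containersToReindex(mapping, "Balance", "amount")); // prints [Account]
        System.out.println(containersToReindex(mapping, "Account", "number")); // prints [Transaction]
    }
}
```

In this example, a change to `Balance.amount` reindexes `Account` but stops there, because `Transaction` only includes `number` from `Account`. That is the behavior we are missing today: the frequently changing property never needs to reach the transaction index.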
This highly pessimistic refresh behavior is a consequence of denormalizing and duplicating all needed relations into a single index, which is done to avoid having to fully implement a “join” feature between indexes in the underlying search engines. Presently, however, it is too pessimistic. (I do believe that, from a technical standpoint, it must be possible to implement an index join feature in Lucene, but that is another topic.)
I see that Hibernate Search 6 is starting to determine the inverse side of @IndexedEmbedded automatically, without a @ContainedIn, and that more indexing can be triggered through @IndexingDependency, which is nice. However, we also need new ways to get less automatic reindexing across @IndexedEmbedded/@ContainedIn pathways.
Please let me know if what we are looking for will be or is already implemented, or if an enhancement request is needed for Hibernate Search 6. Thank you!