At work we received a new database dump, about 2 gigabytes in size. After restoring it, I tried to index a single ORM entity: ActePrive.
As a result, here is the number of documents created on my Elasticsearch backend server:
So 51539 documents were indexed in total. I immediately checked this against the database with a SQL query:
-- Pure ActePrive
select count(ap.id) from acte_prive ap join acte a on ap.id = a.id where ap.id not in (select ad.id from acte_document ad join acte ac on ac.id = ad.id)
union
-- Pure ActeDocument
select count(ap.id) from acte_prive ap join acte a on ap.id = a.id where ap.id in (select ad.id from acte_document ad join acte ac on ac.id = ad.id);
Output:
262575
242293
Here is the JPA model, ordered along the inheritance chain (ActeDocument is the most derived class):
@Entity
@PrimaryKeyJoinColumn(name = "ID")
@Inheritance(strategy = InheritanceType.JOINED)
@Scope(proxyMode = ScopedProxyMode.TARGET_CLASS)
public class ActeDocument extends BasicActeDocument {
}
@MappedSuperclass
public abstract class BasicActeDocument extends ActePrive {
}
@XmlRootElement(name = "ActePrive")
@Entity
@PrimaryKeyJoinColumn(name = "ID")
@Inheritance(strategy = InheritanceType.JOINED)
@Scope(proxyMode = ScopedProxyMode.TARGET_CLASS)
public class ActePrive extends BasicActePrive implements ... {
}
@MappedSuperclass
public abstract class BasicActePrive extends Acte implements ... {
}
@Entity
@Inheritance(strategy = InheritanceType.JOINED)
@XmlSeeAlso({
ActePrive.class,
ActePublic.class
})
public class Acte extends BasicActe implements ... {
}
@MappedSuperclass
public abstract class BasicActe extends BusinessObject implements ...{
}
public abstract class BusinessObject implements Serializable {
}
To be precise, only ActePrive is an @Indexed entity. The Acte entity is only embedded for indexing (ActePrive indexes some data from Acte).
Out of roughly 500 000 records, only about a tenth are indexed!
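To make the gap concrete, here is a minimal sketch of the arithmetic, using only the three counts quoted above (the class and constant names are made up for illustration):

```java
import java.util.Locale;

final class IndexCoverage {
    // Counts copied from the SQL output and the Elasticsearch document count above.
    static final long PURE_ACTE_PRIVE = 262_575L;   // ActePrive rows that are not ActeDocument
    static final long ACTE_DOCUMENT = 242_293L;     // ActePrive rows that are also ActeDocument
    static final long INDEXED_DOCUMENTS = 51_539L;  // documents found in the index

    // Total number of ActePrive rows (both kinds) present in the database.
    static long totalRows() {
        return PURE_ACTE_PRIVE + ACTE_DOCUMENT;
    }

    // Fraction of database rows that ended up in the index.
    static double indexedRatio() {
        return (double) INDEXED_DOCUMENTS / totalRows();
    }

    public static void main(String[] args) {
        System.out.println("rows in DB: " + totalRows());
        System.out.println(String.format(Locale.ROOT, "indexed: %.1f%%", 100 * indexedRatio()));
    }
}
```

The two SQL counts add up to 504868 rows, so the 51539 indexed documents are roughly 10% of what should be there.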
Note: I turned on <logger name="org.hibernate.search" level="trace"/> and no errors seem to be displayed during the indexing process.
Before running it, I delete all the indexes, of course:
DELETE {{elserver}}/_all
Below is a snippet of the ActePrive indexing mapping:
@Component("signatureOrmSearchMappingConfigurer")
public class SignatureOrmSearchMappingConfigurer implements HibernateOrmSearchMappingConfigurer {
    private final ActePriveConfigurer actePriveConfigurer;
    private final ActeConfigurer acteConfigurer;
    ....
    @Override
    public void configure(HibernateOrmMappingConfigurationContext context) {
        acteConfigurer.configure(context);
        actePriveConfigurer.configure(context);
    }
}
@Component
public class ActePriveConfigurer implements ... {
    private final DateValueBinder dateValueBinder;
    private final ActeRoutingBinder acteRoutingBinder;

    public ActePriveConfigurer(DateValueBinder dateValueBinder, ActeRoutingBinder acteRoutingBinder) {
        this.dateValueBinder = dateValueBinder;
        this.acteRoutingBinder = acteRoutingBinder;
    }

    @Override
    public void configure(HibernateOrmMappingConfigurationContext context) {
        ProgrammaticMappingConfigurationContext mapping = context.programmaticMapping();
        TypeMappingStep actePriveMapping = mapping.type(ActePrive.class);
        // all fields indexes creation here
        // DISABLING RoutingBinder to not have any difference with DB
        actePriveMapping.indexed(); //.routingBinder(acteRoutingBinder);
    }
}
@Component
public class ActeConfigurer implements ... {
    // Field added so that the constructor assignment below compiles
    private final DateValueBinder dateValueBinder;

    public ActeConfigurer(DateValueBinder dateValueBinder) {
        this.dateValueBinder = dateValueBinder;
    }

    @Override
    public void configure(HibernateOrmMappingConfigurationContext context) {
        ProgrammaticMappingConfigurationContext mapping = context.programmaticMapping();
        TypeMappingStep acteMapping = mapping.type(Acte.class);
        // Declare field indexes here only, no acteMapping.indexed() is done
    }
}
This is really low and could result in poor performance. That probably doesn’t cause your problem, though, so let’s move on.
Do you re-create the indexes, though? If you don’t, Elasticsearch may create the indexes automatically and will try to guess the mapping, which it generally guesses wrong.
You may want to use .dropAndCreateSchemaOnStart(true) on the mass indexer, instead of deleting indexes manually. This will re-create the indexes as necessary.
Once again, that probably isn’t related to your problem, though.
I see this in your logs:
2022-04-08 11:15:07,099 c.f.signature.services.business.IndexBS [E] Exception during reindex crpcen
java.lang.NoSuchMethodError: org.hibernate.search.mapper.orm.common.impl.EntityReferenceImpl.<init>(Lorg/hibernate/search/mapper/pojo/model/spi/PojoRawTypeIdentifier;Ljava/lang/String;Ljava/lang/Object;)V
at org.hibernate.search.mapper.orm.mapping.impl.HibernateOrmMapping.createEntityReference(HibernateOrmMapping.java:214)
at org.hibernate.search.mapper.orm.mapping.impl.HibernateOrmMapping.createEntityReference(HibernateOrmMapping.java:77)
at org.hibernate.search.engine.backend.common.spi.EntityReferenceFactory.safeCreateEntityReference(EntityReferenceFactory.java:34)
at org.hibernate.search.mapper.pojo.work.impl.PojoDocumentContributor.contribute(PojoDocumentContributor.java:54)
at org.hibernate.search.backend.elasticsearch.index.impl.ElasticsearchIndexManagerImpl.createDocument(ElasticsearchIndexManagerImpl.java:164)
at org.hibernate.search.backend.elasticsearch.work.execution.impl.ElasticsearchIndexIndexer.index(ElasticsearchIndexIndexer.java:79)
at org.hibernate.search.backend.elasticsearch.work.execution.impl.ElasticsearchIndexIndexer.add(ElasticsearchIndexIndexer.java:44)
at org.hibernate.search.mapper.pojo.work.impl.PojoTypeIndexer.add(PojoTypeIndexer.java:66)
at org.hibernate.search.mapper.pojo.work.impl.PojoIndexerImpl.add(PojoIndexerImpl.java:47)
at org.hibernate.search.mapper.pojo.massindexing.impl.PojoMassIndexingEntityLoadingRunnable$IndexingBatch.startIndexing(PojoMassIndexingEntityLoadingRunnable.java:208)
at org.hibernate.search.mapper.pojo.massindexing.impl.PojoMassIndexingEntityLoadingRunnable$IndexingBatch.startIndexingList(PojoMassIndexingEntityLoadingRunnable.java:155)
at org.hibernate.search.mapper.pojo.massindexing.impl.PojoMassIndexingEntityLoadingRunnable$LoadingContext$1.accept(PojoMassIndexingEntityLoadingRunnable.java:125)
at org.hibernate.search.mapper.orm.loading.impl.HibernateOrmMassEntityLoader.load(HibernateOrmMassEntityLoader.java:49)
at org.hibernate.search.mapper.pojo.massindexing.impl.PojoMassIndexingEntityLoadingRunnable.runWithFailureHandler(PojoMassIndexingEntityLoadingRunnable.java:60)
at org.hibernate.search.mapper.pojo.massindexing.impl.PojoMassIndexingFailureHandledRunnable.run(PojoMassIndexingFailureHandledRunnable.java:32)
Most likely you’re not using the same version of Hibernate Search for all your Hibernate Search dependencies. Make sure your dependencies are consistent.
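One way to check for mixed versions at runtime is to print which jar each class involved in the NoSuchMethodError is loaded from. The sketch below is only a diagnostic idea, not part of Hibernate Search: `JarLocator` and `describe` are hypothetical names, and the two class names are taken from the stack trace above. It assumes the helper runs with the same class loader as the application code.

```java
import java.security.CodeSource;
import java.util.List;

final class JarLocator {

    // Returns "<class name> -> <jar or directory it was loaded from>".
    static String describe(String className) {
        try {
            // Look the class up by name without initializing it.
            Class<?> type = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return className + " -> (no code source, e.g. a JDK class)";
            }
            return className + " -> " + source.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not found on the classpath";
        }
    }

    public static void main(String[] args) {
        // Class names copied from the NoSuchMethodError stack trace.
        List<String> names = List.of(
                "org.hibernate.search.mapper.orm.common.impl.EntityReferenceImpl",
                "org.hibernate.search.mapper.pojo.model.spi.PojoRawTypeIdentifier");
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

If the printed jar names carry different version numbers, or a class comes from an unexpected location such as a patched copy, the dependencies are inconsistent.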
Maybe, but my DB is MariaDB, and I followed the tip about MySQL databases in the Hibernate Search reference documentation:
A note to MySQL users: the MassIndexer uses forward only scrollable results to iterate on the primary keys to be loaded, but MySQL’s JDBC driver will preload all values in memory.
Thanks for pointing out the error in the log; it was a leftover trace of an old monkey patch (they love monkey patches at my work). After fixing it, I managed to index about 246000 ActeDocument but still no ActePrive. I then looked at the logs again, and exceptions like this one still appeared during indexing:
2022-04-11 15:37:24,326 o.h.e.internal.DefaultLoadEventListener allegoria [I] HHH000327: Error performing load command
org.hibernate.InstantiationException: Cannot instantiate abstract class or interface: : com.allegoria.notariat.business.MentionPublication
at org.hibernate.tuple.PojoInstantiator.instantiate(PojoInstantiator.java:79)
at org.hibernate.tuple.PojoInstantiator.instantiate(PojoInstantiator.java:105)
at org.hibernate.tuple.entity.AbstractEntityTuplizer.instantiate(AbstractEntityTuplizer.java:705)
at org.hibernate.persister.entity.AbstractEntityPersister.instantiate(AbstractEntityPersister.java:5285)
at org.hibernate.internal.SessionImpl.instantiate(SessionImpl.java:1627)
at org.hibernate.internal.SessionImpl.instantiate(SessionImpl.java:1611)
Up to this point, I managed to index about 1000 ActePrive, and then the indexing ends. I understand the errors, but normally the process should not stop just because indexing one record failed, whatever the reason, should it? I have a class for that:
public class HibernateSearchMassIndexingFailureHandler implements MassIndexingFailureHandler {
    public void handle(MassIndexingEntityFailureContext context) {
        try {
            context.entityReferences().stream()
                    .map(object -> (EntityReference) object)
                    // filter out the records already marked for re-synchronization
                    .forEach(entityReference -> {
                        routingService.setOutOfWebContextCrpcen(entityReference.tenant());
                        indexErrorBS.storeIndexError(entityReference);
                    });
            logger.warn("Entity error during mass indexing " + context.failingOperation().toString(), context.throwable());
        } finally {
            // Forget the crpcen used for scheduling
            routingService.setOutOfWebContextCrpcen(null);
        }
    }
}
For indexing it’s true, but in this case it’s not indexing that failed, it’s loading the entity from the database. You have a serious problem in your Hibernate ORM mapping and should solve this.
I suppose we could also call the mass indexing failure handler for exceptions thrown while loading entities, and continue indexing, because there might also be exceptions caused by temporary failure to communicate with the database. Though that would probably require creating a new Session… Maybe you could open a ticket on JIRA?
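The distinction made above (a failure while indexing an entity goes to the failure handler and the run continues, while a failure while loading entities ends the run) can be pictured with a toy model. This is only a simplified sketch of the behaviour described in this thread, not Hibernate Search's actual implementation; `ToyMassIndexer`, `unloadable` and `unindexable` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ToyMassIndexer {
    final List<Integer> indexed = new ArrayList<>();   // ids that reached the index
    final List<Integer> reported = new ArrayList<>();  // ids passed to the failure handler

    private final Set<Integer> unloadable;   // ids whose loading from the DB fails
    private final Set<Integer> unindexable;  // ids whose indexing fails

    ToyMassIndexer(Set<Integer> unloadable, Set<Integer> unindexable) {
        this.unloadable = unloadable;
        this.unindexable = unindexable;
    }

    // Returns true if every id was processed, false if a loading failure ended the run.
    boolean run(List<Integer> ids) {
        for (int id : ids) {
            // Loading phase: in this model a failure is not routed to the handler,
            // it stops the whole run and the remaining ids are never visited.
            if (unloadable.contains(id)) {
                return false;
            }
            // Indexing phase: the failure is reported and the loop moves on.
            if (unindexable.contains(id)) {
                reported.add(id);
                continue;
            }
            indexed.add(id);
        }
        return true;
    }

    public static void main(String[] args) {
        ToyMassIndexer toy = new ToyMassIndexer(Set.of(5), Set.of(2));
        boolean completed = toy.run(List.of(1, 2, 3, 4, 5, 6));
        System.out.println("completed=" + completed
                + " indexed=" + toy.indexed + " reported=" + toy.reported);
    }
}
```

In this model, id 2 is reported and skipped, while the loading failure on id 5 means id 6 is never processed. The change suggested above would amount to treating the loading branch like the indexing branch.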