Indexing in Hibernate Search 5.5.8.Final

Good Day!
I am trying to migrate Hibernate Search from 4.2.0.Final to 5.5.8.Final. In the previous version my application's indexing behaviour was fine, but with 5.5.8.Final I have observed that when I update an entity, a full reindexing from the database is started, which takes around 7 minutes and degrades the application's performance. I have set hibernate.search.indexing_strategy to event. My expectation is that on update only that particular entity is reindexed and that this is quick, which is what happens in the previous version, 4.2.0.Final.
Is there any new configuration introduced in this version that I have not set properly? Please advise me on this issue and on any configuration required to optimize indexing in 5.5.8.Final.

Please find below the configuration I have used in my application:

        <property name="hibernate.search.default.directory_provider" value="filesystem"/>
        <property name="hibernate.search.default.indexBase" value="server/tmp"/>
        <property name="hibernate.search.indexing_strategy" value="event"/>
        <property name="hibernate.default_batch_fetch_size" value="16"/>

Hello @ssaurabhhibernate,

As I already stated there:

Please […] provide more information, in particular the mapping you are using. If your mapping is complex, it would help to provide a reproducer based on our test case templates.

We really need your mapping to diagnose potential performance issues, otherwise we might as well give you random answers.

I noticed you are also having trouble with performance with Hibernate ORM. I would advise you to solve those performance issues first, as a non-optimized ORM mapping will perform poorly both in ORM and in Search.

Hi Yoann,
Thanks…
That ORM issue has been identified, and now I have observed that Hibernate Search is taking a long time on each CRUD operation. So my only concern is why Hibernate Search is doing a full reindexing every time and taking so long.

I am using the CreditApplication entity below, which uses the CreditApplicationIndexingInterceptor class:

@Indexed(interceptor = CreditApplicationIndexingInterceptor.class)
public class CreditApplication extends AbstractEntity {

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "pbe_id", nullable = false)
    @IndexedEmbedded
    public Counterparty getPbe() {
        return pbe;
    }

    public void setPbe(Counterparty pbe) {
        this.pbe = pbe;
    }


    @IndexedEmbedded
    @BatchSize(size = 1) // see comment getCollateralItems
    public List<Facility> getFacilities() {
        return facilities;
    }

    public void setFacilities(List<Facility> facilities) {
        this.facilities = facilities;
    }


    @ManyToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    @Fetch(FetchMode.JOIN)
    @JoinColumn(name = "status_id", nullable = true)
    @IndexedEmbedded
    public CreditApplicationStatus getStatus() {
        return status;
    }

    public void setStatus(CreditApplicationStatus status) {
        this.status = status;
    }


    @OneToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    @JoinColumn(name = "credit_application_main_id", nullable = true)
    @IndexedEmbedded
    public CreditApplicationMain getMainInfo() {
        return mainInfo;
    }

    public void setMainInfo(CreditApplicationMain mainInfo) {
        this.mainInfo = mainInfo;
    }


    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "country_of_origin_id")
    @IndexedEmbedded
    public Country getCountryOfOrigin() {
        return countryOfOrigin;
    }

    public void setCountryOfOrigin(Country country) {
        this.countryOfOrigin = country;
    }

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "credit_application_type_id")
    @IndexedEmbedded
    public CreditApplicationType getCreditApplicationType() {
        return creditApplicationType;
    }

    public void setCreditApplicationType(CreditApplicationType creditApplicationType) {
        this.creditApplicationType = creditApplicationType;
    }

    // ... other members omitted
}

CreditApplicationIndexingInterceptor.java

public class CreditApplicationIndexingInterceptor implements EntityIndexingInterceptor<CreditApplication> {

    private static final String[] INDEXED_STATUS = ACTIVE_STATUSES_EXCL_MANAGED.toArray(new String[ACTIVE_STATUSES_EXCL_MANAGED.size()]);


    @Override
    public IndexingOverride onAdd(CreditApplication entity) {
        if (entity.getStatus() != null && !entity.isStatus(INDEXED_STATUS)) {
            return IndexingOverride.SKIP;
        }
        return IndexingOverride.APPLY_DEFAULT;
    }

    @Override
    public IndexingOverride onUpdate(CreditApplication entity) {
        if (entity.getStatus() != null && !entity.isStatus(INDEXED_STATUS)) {
            return IndexingOverride.REMOVE;
        }
        return IndexingOverride.APPLY_DEFAULT;
    }

    @Override
    public IndexingOverride onDelete(CreditApplication entity) {
        return IndexingOverride.APPLY_DEFAULT;
    }

    @Override
    public IndexingOverride onCollectionUpdate(CreditApplication entity) {
        return onUpdate(entity);
    }
}

Will provide more info if required…Thanks.

First, if I remember correctly, fetch = FetchType.LAZY on *ToOne associations will not work out of the box (the association will still be loaded eagerly), unless you use bytecode enhancement. So all those attributes that you marked as LAZY might be loaded every time you change anything in the entity, which would explain the poor performance.

Second, the @BatchSize(size = 1) is very dubious and in my opinion very likely to lead to bad performance. But since you commented it, I guess you know what you’re doing…
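
To give a feel for why the batch size matters, here is a back-of-the-envelope sketch. It assumes, as a simplification, that initializing n lazy collections takes about ceil(n / batchSize) SELECTs; Hibernate's real batch loading may split batches differently. The class and method names are made up for illustration and are not Hibernate API.

```java
// Rough model only, not Hibernate API: estimates how many SELECTs are
// needed to initialize a number of lazy collections for a given @BatchSize.
class BatchFetchEstimate {

    // Simplified estimate: ceil(collections / batchSize), using integer arithmetic.
    static int estimateQueries(int collections, int batchSize) {
        if (collections < 0 || batchSize < 1) {
            throw new IllegalArgumentException(
                    "collections must be >= 0 and batchSize must be >= 1");
        }
        return (collections + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        // Hypothetical number of CreditApplication instances loaded in one session.
        int loaded = 100;
        System.out.println("size = 1:  " + estimateQueries(loaded, 1));
        System.out.println("size = 16: " + estimateQueries(loaded, 16));
    }
}
```

Under this model, size = 1 means one query per collection, which is the same as not batching at all.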

Third, you put unconstrained @IndexedEmbeddeds all over the place. Depending on your object graph, this might be fine, or it might be a total disaster leading to half of the database being loaded in memory.
You should check each @IndexedEmbedded to see if you actually need to embed everything, or if some fields can be ignored, which would improve indexing performance. In particular:

  • If you don’t use all the indexed fields of the embedded entities (e.g. Country.name is indexed and embedded in CreditApplication, but you never use the resulting countryOfOrigin.name field when querying CreditApplication), you might want to limit the scope of your @IndexedEmbedded with @IndexedEmbedded(includePaths = { "path1", "path2", ... })
  • If you have deep index-embedding hierarchies (e.g. CreditApplication index-embeds CreditApplicationMain, which index-embeds Credit, which index-embeds CreditProvider, which index-embeds Employee, and so on), you might want to limit the depth of embedding using @IndexedEmbedded(depth = 4) for example.
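
For instance, here is a sketch of both options applied to two getters from the mapping you posted. The "name" path is an assumption, since I have not seen the fields of Country, and depth = 1 is an arbitrary value; pick whatever matches the fields your queries actually use. Note that every path listed in includePaths has to point to a field that is indexed in the embedded entity.

```java
// Fragment of CreditApplication, not a complete class.
// Assumes Hibernate Search 5.x and the JPA annotations are on the classpath.

    // Embed only the paths that queries actually use.
    // "name" is a hypothetical indexed field of Country.
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "country_of_origin_id")
    @IndexedEmbedded(includePaths = { "name" })
    public Country getCountryOfOrigin() {
        return countryOfOrigin;
    }

    // Stop after one level: the fields of CreditApplicationMain are embedded,
    // but its own @IndexedEmbedded associations are not followed.
    @OneToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    @JoinColumn(name = "credit_application_main_id", nullable = true)
    @IndexedEmbedded(depth = 1)
    public CreditApplicationMain getMainInfo() {
        return mainInfo;
    }
```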

More generally, I would recommend two things:

  1. Try disabling Hibernate Search and see if the CRUD operations are still slow. If they are, then you should focus on optimizing your Hibernate ORM mapping. If they are not, then it’s probably just an issue of lazy loading (you’ll want to raise the batch size) or of @IndexedEmbedded not being constrained enough (add depth = or includePaths =)
  2. Enable logging of SQL statements and see what happens when you execute one CRUD operation. If there are lots of queries, there must be something wrong in your mapping, either the Hibernate ORM mapping (missing LAZYs) or in the Hibernate Search mapping (nested @IndexedEmbedded that triggers loading of half the database). Note that database logs might help a lot too, depending on your database vendor. But really, that’s too broad a topic to be addressed here.
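
As a rough sketch of what those two steps could look like in the configuration format you posted (use these temporarily, in a test environment; the first property switches off the Hibernate Search event listeners, the other two are standard Hibernate ORM logging settings):

```xml
<!-- Step 1: disable Hibernate Search's automatic indexing to compare CRUD timings -->
<property name="hibernate.search.autoregister_listeners" value="false"/>

<!-- Step 2: print the SQL statements executed by Hibernate ORM -->
<property name="hibernate.show_sql" value="true"/>
<property name="hibernate.format_sql" value="true"/>
```

Run step 1 and step 2 separately: with the listeners disabled you will not see the queries triggered by indexing.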

Also, unrelated: I think a @OneToMany or @ManyToMany is missing on getFacilities.

Thanks Yoann… It was a code issue; after optimizing and refactoring the code as per the Hibernate 5 spec, it has been resolved. Thanks for your assistance, much appreciated :)