Query execution extremely slow after upgrading to Hibernate ORM 6.2

I know :frowning: In the meantime I invested many hours to investigate including great help from Christian and that’s the reason I have new info comparing to initial post.

I’ll investigate more because currently I have working project in the production without bugs (at least no bugs found) so I can take a break from development and use this time to help us all in order to get to the root of the problem.

Well, this is true that I found a way to avoid initial behaviour, but I did cut off pretty important part of indexing by removing @IndexedEmbedded' annotation and on the other side any references to Manufacturer` which makes search less usable. I mean this was just to be sure what is the cause, not final solution that I can use.

Right now I can offer:

  1. AnyDesk session to my development machine,
  2. VBox Ubuntu with the same environment (26 GB download),
  3. BitBucket access to project.

I can try to make a copy of project and start to remove things, but it could take days and days because there are so many classes and many of those depend on each other.

I’m sorry but I cannot do that.

As you can imagine, if this takes you days, this will take me days.

If you nevertheless manage to reduce your reproducer to a size that can be shared as a simple project, I’ll gladly have a look.

1 Like

@beikov Just to put an update on this:
Today I invested 7 more hours and created runnable reproducer.
Download URL is provided to @yrodiere in private message to keep some privacy for my classes :rofl:

2 Likes

Thank you @horvoje, this helped a lot.

I sent a (much) simplified reproducer here: Reproducer for embeddable being always marked as dirty with ORM 6.2 by yrodiere · Pull Request #261 · hibernate/hibernate-test-case-templates · GitHub

To sum up: one of your embeddables (TextId) is always considered dirty by Hibernate ORM 6.2, even if it didn’t change. As a result Hibernate Search triggers reindexing all the time.

This may be a bug, or this may be something that you need to change in your code: I know there were improvements regarding mutability and perhaps dirty checking in ORM 6.2, and the new behavior may be related to your custom converter from Map to String (LocaleStringArrayMapConverter).

@beikov , could you please have a look at the reproducer and give us your opinion?

If it’s not a bug, perhaps an entry in the migration guide would make sense?

2 Likes

Great news! Just for the test I did changes to TextID class and made it as more immutable as I could:

  • removed setters
  • allowed to be instantiated and filled only by a constructor,
  • the map created by constructor is instance of UnmodifiableMap to put another 2 cents to immutability,
  • no-args constructor is mandatory by Hibermate but Hibernate uses reflection I guess because there are no setters; also class properties cannot be final because of this constructor which prevents me to make it fully immutable.

Now it runs much faster with the same data set. Previously it was about a minute and above, and now it takes about 5 seconds. There are still more articles reindexed than expected (1), but now I have to go through all my classes in model and check is there any other mutable thing making trouble to Hibernate Search.

@beikov @yrodiere Thanks a million for your time and effort!

P.S. New version of TextID class is uploaded to my homepage and URL is provided to Yoann in private message just for the case any of you want to check the difference. I’ll make all needed changes to initial reproducer if needed to use this new class.

P.P.S. I didn’t make any change in LocaleStringArrayMapConverter for now.

Have a nice weekend!

It’s me again with more info :slight_smile:
Now after fixing TextID and having remarkable less unneeded indexing, I found out the connection (except TextID which is now fixed) which causes reindexing: Provider entity and Manufacturer entity.
First item on the list is the one being updated. The provider for this item is “Wolt Zagreb”.
All other items that does not need reindexing has the same Provider in their provider list.

And the specific thing with Provider is that it has many annotations as a class property:

When I remove a price from given article which means no Provider is given, then all articles from the same Manufacturer are indexed.

@IndexedEmbedded @Reference @JsonBackReference @ManyToOne
private Provider provider;

@ManyToOne(fetch = FetchType.LAZY)
@IndexedEmbedded
@AssociationInverseSide(inversePath =
    @ObjectPath(@PropertyValue(propertyName = "articles"))
)
private Manufacturer manufacturer;

Note: Few minutes ago tried with 6.2.2 and it has the same “bug” - if we are allow to call it a bug.


Log snippet while updating article “Tapir 0,33 l”:

### Article route(): Indexing ID 21104 'Tapir 0,33 l' [Wolt Zagreb]
### Article route(): Indexing ID 21094 'Tortilja valja zbilja!' [Wolt Zagreb, Wolt Zagreb, Glovo Zagreb]
### Article route(): Indexing ID 21095 'Njoke, vi�e od OK!' [Glovo Zagreb, Wolt Zagreb]
### Article route(): Indexing ID 21096 'Pastu pesto �elim ?esto!' [Wolt Zagreb, Glovo Zagreb]
### Article route(): Indexing ID 21097 'Bolonjez, mesa bez!' [Glovo Zagreb, Wolt Zagreb]
### Article route(): Indexing ID 21098 'Ave, Cezar Tofu!' [Wolt Zagreb, Glovo Zagreb]
### Article route(): Indexing ID 21099 'Sushirrito se naljutio!' [Wolt Zagreb, Glovo Zagreb]
### Article route(): Indexing ID 21100 'Sushirrito se zazelenio!' [Wolt Zagreb, Glovo Zagreb]
### Article route(): Indexing ID 21101 'Tortilja vege izobilja!' [Wolt Zagreb]
### Article route(): Indexing ID 21102 'Chia desert, Golden milk ekspert!' [Wolt Zagreb]
### Article route(): Indexing ID 21103 'Kornja?a 0,33 l' [Wolt Zagreb]
### Article route(): Indexing ID 15947 'Zagi ?izi' [Wolt Zagreb, Sve najbolje]
### Article route(): Indexing ID 18653 'Margherita verde (large)' [Wolt Zagreb, Caruso catering & pizza]
### Article route(): Indexing ID 15946 'Zagi salama' [Sve najbolje, Wolt Zagreb]
### Article route(): Indexing ID 18651 '�okica (velika)' [Wolt Zagreb, Caruso catering & pizza]
### Article route(): Indexing ID 18652 'Samoborka (large)' [Wolt Zagreb]
### Article route(): Indexing ID 16194 'Zagi ?izi gurme' [Sve najbolje, Wolt Zagreb]
### Article route(): Indexing ID 18648 'Rambo Vegan (large)' [Wolt Zagreb, Caruso catering & pizza]
### Article route(): Indexing ID 15945 'Zagi kebab' [Sve najbolje, Wolt Zagreb]
### Article route(): Indexing ID 2059 'Tofu burger' [Vegehop, Glovo Zagreb, Wolt Zagreb]
### Article route(): Indexing ID 16195 'Vege purger' [Wolt Zagreb, Sve najbolje]
### Article route(): Indexing ID 18650 'Patron (large)' [Wolt Zagreb]
### Article route(): Indexing ID 18654 'Verde funghi (large)' [Wolt Zagreb]
### Article route(): Indexing ID 15944 'Zagi burger' [Sve najbolje, Wolt Zagreb]

I did changes to TextID class and made it as more immutable as I could:

If it truly is immutable, try annotating it with @Immutable (the Hibernate annotation). The Hibernates should not send any change event for this type, and hopefully this will remove a bit more indexing ?

1 Like

I did this as you said and also did the same to all my embeddables - annotated with @org.hibernate.annotations.Immutable, removed all setters and made them to use constructor or builder; any possible internal map is created as unmodifiable - and amount of reindexing reduced. Not to just one entioty redindexed as it was the case with 6.1.7 but at least I can see some logic and track what’s going on in contrary to my first post when almost all entities in all repositories were reindexed by a single entity update.

Provider was marked with '@Embeddablewhich was (I guess) wrong way to make back reference to@IndexedEmbedded; this annotation is removed from Provider` making it regular entity.

The difference comparing to 6.1.7 is more clear - entities that have @IndexedEmbedded are causing additional indexing. To be more detailed, two of them: Manufacturer which is direct property of Article and Provider which is property of PriceInfo which is then collection member inside Article.

I’m not sure am I supposed to make Provider and Manufacturer immutable too because they are “independent” entities having their own repositories and they have properties which are set later in process of creation.

When I do the update of those entities, first step is fetch by ID to get managed item, then use setters to set fields that changed and then persist with repository save() method. If they were immutable, I’m not sure what would be the best way to do updates. But of course, if that’s the cost to get 6.1.7 behaviour, I’ll learn how to make needed changes to them.

P.S. The main question now is about “dirty check” - it entity is not @Immutable, should Hibernate reindex it always or not. If the answer is yes, then I’ll find another way to update. If the answer is no, then I can wait as long as needed it’s fixed.

If the entity is not really dirty, then there should not be a need for reindexing. Something that we changed in 6.2 is how we treat converted types a.k.a. @Convert/AttributeConverter, as some users reported that dirty/mutated values of such types were not picked up by Hibernate. There is a chance that there is still a bug somewhere, but I’ll look into the test case that Yoann provided to see what is happening.

1 Like

Thanks!

No, you should not mark entities as immutable if they are mutable.

Hibernate Search should reindex your entities when data used to produce the resulting indexed document changed. There are limitations, but that’s the crux of it.

If you want an exact description of what will get reindexed by Hibernate Search and when, set the log level for Hibernate Search to DEBUG: you’ll get a description of the reindexing resolvers logged on startup. Have a look at the one for Manufacturer and look for “articles” in there.

Of course. That’s why I used “make”, not “mark”. I only use this annotation on real immutables that are created through constructors only and if they have a map, this map is created inside constructor as unmodifiable map.
For entities mentioned there is way to change the code and make them immutable, but in that case I’d get much more code and many “hacks” as I set their properties not all at once but in 2 or even 3 stages (TextID is good example because you cannot create TexID unless you have name already set; to get TextID I need search service and/or repository to check is there any other entity with the same ID; having Text ID created in constructor would be hard - I mean I could send a link to a service into constructor, but I’m not happy with that solution).
I’ll definitely wait to see the final result of this. If this pops up as a bug, than I won’t have to change anything - just wait for the next release. In other case, I’ll see what I have to do to make my code be compliant wtih 6.2.

1 Like

FYI
Few minutes ago tried to upgrade to 6.2.3 and it’s the same situation as with 6.2.1 and 6.2.2 - not fixed, almos every entity becomes “dirty” and being reindexed.

I didn’t look into the test case yet, so yeah, that’s expected. I will try to do so this week though.

I looked into the test case and during the dirty checking of two TextId instances which have the same logical contents, the equals check returns false, because textIdMap contains an array and arrays don’t work with equals. I quickly tested the same thing with ORM 5 and it failed the same way.

In 6 you can either switch to a type that has a proper equals implementation i.e. ArrayList, or create a custom BasicJavaType for your Map<Locale, String[]>, which you can then annotate on the field via @JavaType(LocaleStringMapJavaType.class).

2 Likes

Thanks! Any idea why does it work with 6.1.7? Or is it just a lack of checks in 6.1?

I’ll try to make changes you proposed and get back here to put an update.

Thanks! Any idea why does it work with 6.1.7? Or is it just a lack of checks in 6.1?

We had a bug where some types were never considered dirty. Might be related to that.

2 Likes

You nailed it! Thanks a million!
Just right now I modified TextIdto use ArrayList and then switched to 6.2.3 and it’s working.
Right now I can’t be sure if everything is updated as expected because I had to implement my own reindexing logic on some critical points, but for sure it’s not slow anymore.

2 Likes