Since reporting the issue, major performance improvements have been made and of the 4 test-cases attached to the ticket, 3 are now faster in v6. However, one of the tests is still significantly slower and seems to have major performance implications in our application:
5.6
Main
Fetch Entities with Associations
38.74S
41.60S
Fetch Entities without Associations
23.41S
22.82S
Small Transactions Criteria
1M5.81S
56.42S
Small Transactions HQL
1M7.38S
56.19S
I just re-ran our mass indexer that issues hundreds of thousands of queries to rebuild our search index. Here are the results:
5.6.15: MassIndexerImpl - Finished updating index in 418s.
6.2.0-CR4: MassIndexerImpl - Finished updating index in 459s.
So H6 is roughly 10% slower than H5. A similar difference in performance can be seen in my minimal test case testFetchEntityWithAssociationsPerformance.
Hibernate 6 seems to do significantly more work to initialize associations than H5.
These PRs contain the performance tests against the various branches:
I tried profiling the testFetchEntityWithAssociationsPerformance test against main and 5.6, but there have been so many changes to the code between the two versions that it isnāt possible for me to determine the root cause of these regressions.
Can someone take another look at this? I think H6 has the potential to be significantly faster than H5, but the problem with associations currently makes it slower in our application.
OK, unfortunately Hibernate 6.2.2 is still about 5-10% slower for our mass indexer.
The remaining difference might not be due to slower query performance though. It is likely caused by Hibernate 6 allocating significantly more memory and causing more frequent GC.
This is the memory allocation for a full indexer run in Hibernate 5:
I just ran the tests again against 6.2.4 with the new fixes for query caches. The situation has improved, but only very slightly. Iām afraid that the caching issues are not the root of the regressions either.
Hibernate 5:
2023-06-02 15:27:53,322 [or-4] INF c.s.b.s.s.i.MassIndexerImpl - Finished updating index in 429s.
Hibernate 6.2.4:
2023-06-03 12:30:45,081 [or-4] INF c.s.b.s.s.i.MassIndexerImpl - Finished updating index in 452s.
Cache statistics look much better now, but there still seem to fewer hits than in H5:
I donāt know what queries the MassIndexer emits, but there is one open improvement that might affect the overall performance: [HHH-15086] - Hibernate JIRA
No idea what the best way forward would be at this point. I think you would have to do some debugging to understand what is going on and compare query logs.
It seems a bit odd that with Hibernate 6.2 you have more cache puts than with 5.6. Maybe you can investigate what it is that Hibernate 6.2 puts into the cache that hasnāt been before? Set a break point in org.hibernate.engine.internal.StatisticalLoggingSessionEventListener#cachePutEnd to figure this out. Donāt forget to bump up the transaction timeout for that test.
I just found the time to do some more investigations:
Caching
My first experiment was to disable query and 2nd-level caches in my project to make sure that caching isnāt the root cause of my issues. This are the results:
Hibernate 5.6:
INF MassIndexerImpl - Finished updating index in 445s.
Hibernate 6 (Main):
INF MassIndexerImpl - Finished updating index in 523s.
So without caches, the situation is even worse. Caching actually hid some of the performance issues in my case.
Performance Tests
Then I decided to revisit my original performance tests to add some more lazy-loading and batch fetching, which the mass indexer uses heavily. I was able to reproduce a major difference in performance with a very basic test:
Hibernate 5.6:
Query took 51.71S
Hibernate 6 (Main):
Query took 1M10.13S
These are the PRs containing the tests:
I hope the test-cases are enough to find the root cause of these issues.
It would be interesting to compare how much time is spent flushing in 5.6 vs 6. I donāt think the flushing logic has changed much between 5.6 and 6 so Iām not sure it is the source of the performance regressions.
With the usual caveat that the profiler distorts the results because it adds overhead to every method call.
I do not know Hibernate well enough to see what might be off here. I will have very limited Internet connectivity for a while so I wonāt be able to look into it further for now.
Iāll try to spend some time on this today. Like I wrote initially, it would be best to first look into the queries that are generated and trace back where the queries come from which werenāt generated before.
I think the additional queries are only a minor issue. Iām on vacation for the next 2-3 weeks and will look into it when Iām back.
It would be great if you could profile the test cases I provided. There are some inefficiencies hiding here that add significant overhead compared to Hibernate 5 and Iām pretty sure they are the source of my performance problems. Iāve seen the same hotspots when I profiled the mass indexer on our staging system.
Iāll start off with investigating the additional queries, and when that is done, look into possible code inefficiencies. Additional queries never is a minor thing. Depending on the data/index situation these additional queries can add up a few seconds.
After investigating a bit, I found that the amount of queries is the same. Hibernate 6.2 generates array_contains predicates instead of in predicates in the lazy loading queries, but the performance seems equivalent. The only deviation I could notice is due to the use of a CriteriaQuery.
Hibernate 5 translates the criteria query to HQL and then re-parses that to then finally make use of a cached query plan based on that HQL.
Hibernate 6 on the other hand does not do that anymore, but instead re-translates the query plan. By altering the test to create the TypedQuery only once, the performance is almost the same. The difference is 0.5 - 1 second on my machine. I created [HHH-16782] - Hibernate JIRA to track the query plan caching for criteria queries.
Iāll keep digging to find out what the rest of the performance drop is about and keep you posted.
Small correction, Hibernate 6 re-translates the query all the time even when reusing the TypedQuery. I quickly played around with caching the query plan and the performance obviously improves. On my machine, with caching, this runs 5 seconds faster than Hibernate 5.
Iāll discuss with the team how to continue with this, but either way, please track [HHH-16782] - Hibernate JIRA for further progress. Also, please create a new discourse topic if something else comes up. This one is already way too big