Hibernate partial flushes take a lot of time


#1

Hello,
After changing the persistence framework in a legacy system from Kodo to Hibernate, I'm facing performance problems. The system's batch processing is now about 10 times slower. I tried to find where exactly the problem is, and the session metrics show that partial flushes take most of the time:

2018-03-13 14:12:34,196 [INFO] org.hibernate.engine.internal.StatisticalLoggingSessionEventListener: Session Metrics {
102965 nanoseconds spent acquiring 16 JDBC connections;
207340 nanoseconds spent releasing 16 JDBC connections;
396001847 nanoseconds spent preparing 1998 JDBC statements;
3539973540 nanoseconds spent executing 1768 JDBC statements;
91400409 nanoseconds spent executing 231 JDBC batches;
58822133 nanoseconds spent performing 425 L2C puts;
10042168 nanoseconds spent performing 829 L2C hits;
16763776 nanoseconds spent performing 911 L2C misses;
456234070 nanoseconds spent executing 17 flushes (flushing a total of 44798 entities and 185029 collections);
18485026674 nanoseconds spent executing 935 partial-flushes (flushing a total of 2437061 entities and 2437061 collections)
}

There are no SQL queries sent to the database during the partial flushes, so why do they take so much time? Could you explain what exactly a partial flush does and how to avoid such problems?


#2

You are storing way too many objects in the Hibernate Session. Why do you need to have 45k entities and 185k collections stored in the current Persistence Context?

It takes a lot of time to fetch that data and to flush it as well.

You should use pagination and smaller batch increments.
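
As a rough illustration of "smaller batch increments", here is a minimal sketch in plain Java. The class ChunkedBatch and its process method are hypothetical names made up for this example, not part of Hibernate. The idea is to do the work in fixed-size chunks and run a callback after each chunk. With Hibernate, that callback would typically flush and clear the Session so the Persistence Context never grows beyond one chunk.

```java
import java.util.function.Consumer;

/** Hypothetical helper (not a Hibernate class): processes items in fixed-size chunks. */
final class ChunkedBatch {

    private ChunkedBatch() {
    }

    /**
     * Applies work to every item and runs afterChunk once per completed chunk,
     * plus once more for a trailing partial chunk.
     *
     * @return the number of times afterChunk was invoked
     */
    static <T> int process(Iterable<? extends T> items, int chunkSize,
                           Consumer<? super T> work, Runnable afterChunk) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int inChunk = 0;
        int chunks = 0;
        for (T item : items) {
            work.accept(item);
            if (++inChunk == chunkSize) {
                // With Hibernate this is where the Session would be flushed and cleared.
                afterChunk.run();
                chunks++;
                inChunk = 0;
            }
        }
        if (inChunk > 0) {
            // Handle the last, incomplete chunk.
            afterChunk.run();
            chunks++;
        }
        return chunks;
    }
}
```

Assuming an open Session named session, a call could look like ChunkedBatch.process(ids, 50, id -> handle(session, id), () -> { session.flush(); session.clear(); }), where handle and the chunk size of 50 are placeholders. Note that clearing the Session detaches every entity loaded so far, so this only works if the job does not hold on to managed entities across chunks.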


#3

Thank you for your reply. I understand that it is a very bad idea to store so many objects in the Persistence Context. However, this is a legacy system and it has worked this way for many years with the Kodo persistence framework, which we want to replace with Hibernate.
The system is a huge monolith, and changing its logic is a very hard and risky task. I am facing performance problems when executing batch jobs that need this many objects, and currently I see no way to reduce that need without massively changing the batch job logic.
Is there a way to limit only the execution time of the partial flushes? I have no problems with fetching that data from the database or with full flushes. These take a long time too, but that is acceptable for us, and we experienced the same with Kodo. The only thing we would love to avoid is what the last line of the Session Metrics is about: very long-lasting partial flushes.


#4

You can avoid partial flushes if you switch to FlushMode.MANUAL.
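
To make the cost more concrete, here is a toy model in plain Java. It is not Hibernate's implementation, and every name in it (ToyPersistenceContext, Managed, Mode) is made up for illustration. It assumes that in automatic flush mode every query first compares each managed entity with its loaded snapshot, even when nothing changed and no SQL is issued, while in manual mode that check only runs when flush is called explicitly.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these types are Hibernate classes. */
final class ToyPersistenceContext {

    enum Mode { AUTO, MANUAL }

    /** A managed entity together with the snapshot taken when it was loaded. */
    static final class Managed {
        String current;
        String loaded;

        Managed(String value) {
            this.current = value;
            this.loaded = value;
        }
    }

    private final List<Managed> managed = new ArrayList<>();
    private final Mode mode;
    private long dirtyChecks = 0;

    ToyPersistenceContext(Mode mode) {
        this.mode = mode;
    }

    /** Registers a new managed entity, as loading it from the database would. */
    Managed load(String value) {
        Managed m = new Managed(value);
        managed.add(m);
        return m;
    }

    /** In AUTO mode every query first dirty-checks all managed entities. */
    void query() {
        if (mode == Mode.AUTO) {
            flush();
        }
        // The SQL query itself would run here.
    }

    /** Compares each entity with its snapshot and returns the number of dirty ones. */
    int flush() {
        int dirty = 0;
        for (Managed m : managed) {
            dirtyChecks++;
            if (!m.current.equals(m.loaded)) {
                dirty++;
                m.loaded = m.current; // pretend an UPDATE was issued
            }
        }
        return dirty;
    }

    long dirtyChecks() {
        return dirtyChecks;
    }
}
```

In this model, 935 queries against roughly 2,600 managed entities add up to about 2.4 million comparisons in AUTO mode and none in MANUAL mode, which is the same order of magnitude as the partial-flush line in the metrics above. In Hibernate 5.2 and later the mode is set with session.setHibernateFlushMode(FlushMode.MANUAL), and older versions use session.setFlushMode(FlushMode.MANUAL). With manual flushing you have to call session.flush() yourself before any query that must see pending changes, and before commit.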

You can also use the bytecode enhancement dirty checking mechanism to speed up the flush.