I have an application which processes a large number of entities from a read-only query using a ScrollableResults. Every now and then, it invokes PersistenceContext.clear() to evict entities that it has already processed. This application used to run with 750MB of heap, but after upgrading my application framework it now requires 2000MB. The heap dump implicates a large StatefulPersistenceContext.nullAssociations Map which PersistenceContext.clear() does not release.
I don’t know if this is a bug, because I don’t know if I’m using Hibernate correctly. Is it okay to invoke PersistenceContext.clear() while I’m iterating over the results of a ScrollableResults? If not, what’s the right way to execute a query which returns a large number of entities with constant memory?
Technical Details:
From experimentation, nullAssociations is populated with the @OneToOne mappings of the entity that is returned by the query. These are not the owning side of the relationship and are usually null. The application only runs out of memory because the query contains a “LEFT JOIN FETCH” for these null entities. If I remove the “LEFT JOIN FETCH”, the query doesn’t run out of memory, but the overall operation takes 30 times longer (20 hours instead of 40 minutes).
From code inspection, it looks like the nullAssociations map is an optimization to reduce queries in the two-phase load. The “LEFT JOIN FETCH” in Phase 1 populates it. Then Phase 2 can skip trying to find the associated entity because it already knows that it doesn’t exist.
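To make that reading concrete, here is a minimal sketch of the idea in plain Java. It is not Hibernate code: the class TwoPhaseLoadSketch, its methods, and the in-memory "database" map are all hypothetical names made up for illustration, and it only models the "remember what is known to be null so Phase 2 can skip the lookup" behavior described above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of a "known null association" cache; hypothetical, not Hibernate's implementation. */
class TwoPhaseLoadSketch {
    // Stand-in for the database: owner id -> associated value (absent means no association).
    private final Map<Long, String> database;
    // Owners whose association is already known to be null from the joined row.
    private final Set<Long> nullAssociations = new HashSet<>();
    private int queriesIssued = 0;

    TwoPhaseLoadSketch(Map<Long, String> database) {
        this.database = new HashMap<>(database);
    }

    /** Phase 1: the joined row already tells us whether the association exists. */
    void readJoinedRow(long ownerId, String joinedAssociation) {
        if (joinedAssociation == null) {
            nullAssociations.add(ownerId);
        }
    }

    /** Phase 2: skip the lookup entirely when the association is known to be null. */
    String resolveAssociation(long ownerId) {
        if (nullAssociations.contains(ownerId)) {
            return null;
        }
        queriesIssued++; // each lookup that is not skipped counts as one query
        return database.get(ownerId);
    }

    int queriesIssued() {
        return queriesIssued;
    }

    int cachedNulls() {
        return nullAssociations.size();
    }
}
```

The point of the sketch is the trade-off: every null association seen in Phase 1 saves a lookup in Phase 2, but it also adds one entry to the set, so the set grows with the number of rows unless something clears it.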
Stepping through both the working and bloated versions in a debugger shows the difference: when StatefulPersistenceContext.initializeNonLazyCollections() reaches the point where it would invoke StatefulPersistenceContext.clearNullProperties(), the good version clears the associations, but the bloated version doesn’t because loadCounter == 1.
By code comparison, the difference seems to be the addition of beforeLoad/afterLoad invocations in StatefulPersistenceContext.prepareCurrentRow():
final PersistenceContext persistenceContext = getSession().getPersistenceContextInternal();
persistenceContext.beforeLoad();
try {
    ...
}
finally {
    persistenceContext.afterLoad();
}
I don’t know how to use GitHub well, but it looks like this change was made by this PR.
With this change, loadCounter == 1 at the point where initializeNonLazyCollections() would invoke clearNullProperties(), which means it won’t release the nullAssociations map.
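The effect can be reproduced with a toy model in plain Java. This is only a sketch of the behavior as I understand it from the debugger, not Hibernate's actual code: ToyPersistenceContext and LoadCounterSketch are hypothetical names, and the guard in initializeNonLazyCollections() is an assumption based on the loadCounter observation above.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy stand-in for the persistence context; hypothetical, not Hibernate's class. */
class ToyPersistenceContext {
    private int loadCounter = 0;
    private final Set<String> nullAssociations = new HashSet<>();

    void beforeLoad() { loadCounter++; }

    void afterLoad() { loadCounter--; }

    void addNullProperty(String key) { nullAssociations.add(key); }

    /** Assumed guard: the cleanup only runs when no load is in flight. */
    void initializeNonLazyCollections() {
        if (loadCounter == 0) {
            nullAssociations.clear();
        }
    }

    int nullAssociationCount() { return nullAssociations.size(); }
}

class LoadCounterSketch {
    /** Simulates scrolling over `rows` rows and returns how many null entries are still held. */
    static int retainedAfterScroll(int rows, boolean wrapInBeforeAfterLoad) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();
        for (int row = 0; row < rows; row++) {
            if (wrapInBeforeAfterLoad) {
                ctx.beforeLoad(); // loadCounter becomes 1 for the duration of the row
            }
            try {
                // The LEFT JOIN FETCH found no associated entity for this row.
                ctx.addNullProperty("Entity#" + row + ".oneToOne");
                ctx.initializeNonLazyCollections();
            } finally {
                if (wrapInBeforeAfterLoad) {
                    ctx.afterLoad();
                }
            }
        }
        return ctx.nullAssociationCount();
    }

    public static void main(String[] args) {
        System.out.println(retainedAfterScroll(1000, false)); // prints 0
        System.out.println(retainedAfterScroll(1000, true));  // prints 1000
    }
}
```

Without the wrapper, each row's entry is cleared as soon as it is added. With the wrapper, the guard never passes, so one entry per row is retained for the life of the scroll.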
In desperation, I hacked my application to invoke PersistenceContext.afterLoad() before invoking ScrollableResults.next() and then invoke PersistenceContext.beforeLoad() to clean up. This manipulates loadCounter so that it’s zero when clearNullProperties is invoked, which tricks it into releasing the nullAssociations map. After this hack, my application seems to work correctly with the same heap usage as before.
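For anyone trying to follow why the hack works, here is the counter arithmetic as a self-contained toy in plain Java. Again, this is a sketch under assumptions and not Hibernate code: CountingContext, ToyScrollableResults, and WorkaroundSketch are hypothetical names, and the toy next() simply wraps row loading in beforeLoad/afterLoad like the snippet above.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy stand-in for the persistence context; hypothetical, not Hibernate's implementation. */
class CountingContext {
    int loadCounter = 0;
    final Set<String> nullAssociations = new HashSet<>();

    void beforeLoad() { loadCounter++; }

    void afterLoad() { loadCounter--; }

    /** Assumed guard: the null-association cleanup only runs when loadCounter is zero. */
    void initializeNonLazyCollections() {
        if (loadCounter == 0) {
            nullAssociations.clear();
        }
    }
}

/** Toy scroll whose next() wraps row loading in beforeLoad/afterLoad. */
class ToyScrollableResults {
    private final CountingContext ctx;
    private final int totalRows;
    private int row = 0;

    ToyScrollableResults(CountingContext ctx, int totalRows) {
        this.ctx = ctx;
        this.totalRows = totalRows;
    }

    boolean next() {
        if (row >= totalRows) {
            return false;
        }
        ctx.beforeLoad();
        try {
            ctx.nullAssociations.add("Entity#" + row + ".oneToOne");
            ctx.initializeNonLazyCollections();
        } finally {
            ctx.afterLoad();
        }
        row++;
        return true;
    }
}

class WorkaroundSketch {
    /** Scrolls all rows and returns how many null entries are still held at the end. */
    static int retainedAfterScroll(int rows, boolean applyHack) {
        CountingContext ctx = new CountingContext();
        ToyScrollableResults results = new ToyScrollableResults(ctx, rows);
        boolean more = true;
        while (more) {
            if (applyHack) {
                ctx.afterLoad(); // counter goes 0 -> -1 before next()
            }
            try {
                more = results.next(); // inside next(): -1 -> 0, so the guard passes
            } finally {
                if (applyHack) {
                    ctx.beforeLoad(); // counter goes -1 -> 0 again
                }
            }
        }
        return ctx.nullAssociations.size();
    }

    public static void main(String[] args) {
        System.out.println(retainedAfterScroll(1000, false)); // prints 1000
        System.out.println(retainedAfterScroll(1000, true));  // prints 0
    }
}
```

The hack depends on the counter being exactly zero inside next(), so it would stop working (or misbehave) if anything else changes loadCounter around the scroll. That fragility is part of why I am asking what the intended usage is.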