I’m facing a serious performance problem while trying to persist a root object that contains a large collection (100k objects). My application is ‘stuck’ for 20-30min between the call to myRepo.saveAndFlush(root) and the actual first SQL query in the logs.
Profiling the application shows that all the time is spent in StatefulPersistenceContext.getOwnerId, which allocates a lot of temporary objects and triggers 1-2 GCs per second. It looks like this method is called because of my bidirectional mapping, and the algorithm does not seem well suited to ‘large’ collections. It might be the same problem as [HHH-1612] Serious performance lost within IdentityMap... - Hibernate JIRA.
I’m wondering what would be the best way to persist such a hierarchy: object A which contains a List<B> of 100k elements, and each B has a List<C> of < 10 elements.
From what I can see, A and B are connected through a FK on B, so why do you need cascading from A to B? You can simply persist B objects that point to A through the FK.
Having said that, if you have an idea how to improve the performance for your use case, we are happily accepting PRs for that.
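To illustrate the suggestion above (persisting the B objects directly instead of cascading from A), here is a minimal sketch of saving a large list in fixed-size chunks. It is plain Java and does not call Hibernate itself: the class name ChunkedPersist, the method saveInChunks and the saveChunk callback are hypothetical names invented for this example, and the callback stands in for whatever persistence call the application actually uses.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Hibernate or Spring Data.
class ChunkedPersist {

    // Hands the list to saveChunk in consecutive slices of at most chunkSize elements.
    // saveChunk is an assumption: it represents the caller's own persistence code.
    static <T> void saveInChunks(List<T> items, int chunkSize, Consumer<List<T>> saveChunk) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            // subList returns a view of the original list, so no elements are copied
            saveChunk.accept(items.subList(from, to));
        }
    }
}
```

Assuming a Spring Data repository for B (called bRepo here, which is not in the original post), the call could look like ChunkedPersist.saveInChunks(bList, 1000, chunk -> { bRepo.saveAll(chunk); bRepo.flush(); }), with every B already pointing to its saved A. The chunk size of 1000 is arbitrary and would need tuning.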
I don’t really need cascading from A to B, it just seems easier to persist a whole graph of entities.
I just tried changing to cascade="none" while keeping the <bag>. I’d like bList to be fetched automatically instead of making a manual select + a.setBList().
Now I’m getting a whole other class of problems. After persisting a, Hibernate modifies a.bList and replaces every element with an “empty” instance of B.
When I try to “restore” the original bList like this, I get a ConcurrentModificationException:
void saveA(A a) {
    var copyOfBList = List.copyOf(a.getBList());
    myRepo.save(a);
    a.setBList(copyOfBList);
    myRepo.save(a.getBList());
}
if you have an idea how to improve the performance for your use case
Well, I don’t have any knowledge of how Hibernate works internally; that’s why I was asking whether I was missing something.
I don’t really need cascading from A to B, it just seems easier to persist a whole graph of entities.
This “ease” comes with a price, as you can see. Although I am sure there are ways to improve the situation, I think that if you need an improvement here, your best way to get it would be to provide a PR for it.
Now I’m getting a whole other class of problems. After persisting a, Hibernate modifies a.bList and replaces every element with an “empty” instance of B.
I guess that by “empty” you might mean that it is replaced with a managed proxy? Don’t worry, this shouldn’t affect your app.
When I try to “restore” the original bList like this, I get a ConcurrentModificationException
Well, it turns out the actual problem had nothing to do with what I thought it was. It felt like I was going the wrong way, so I started from scratch and noticed the Javadoc of StatefulPersistenceContext.getOwnerId states this:
This is performed in the scenario of a uni-directional, non-inverse one-to-many collection (which means that the collection elements do not maintain a direct reference to the owner)
As I explained in my initial post, my relation is bi-directional, so this method should not be called in my case. I set up a “logging breakpoint” in IntelliJ to log the parameters of this function, and quickly found out that the problem is caused by another not-directly-related mapping. I fixed it to be bi-directional, and now I can save 40k objects in 18s.
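The distinction in that Javadoc can be pictured with a toy model. This is only a sketch in plain Java, not Hibernate code and not Hibernate’s actual algorithm; the classes Parent and Child and both lookup methods are made up for illustration. When the element keeps a reference to its owner, the owner id is read directly; when it does not, the only way to find the owner is to search the known parents’ collections.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical classes, not Hibernate's implementation.
class OwnerLookupSketch {

    static class Parent {
        final long id;
        final List<Child> children = new ArrayList<>();
        Parent(long id) { this.id = id; }
    }

    static class Child {
        Parent owner; // stays null when the relation is modelled as uni-directional
    }

    // Bi-directional case: the child knows its owner, so the lookup is a field read.
    static Long ownerIdViaBackReference(Child child) {
        return child.owner == null ? null : child.owner.id;
    }

    // Uni-directional case: no back reference, so every known collection is scanned.
    // Repeating this for each of n children costs on the order of n * n comparisons.
    static Long ownerIdViaScan(List<Parent> knownParents, Child child) {
        for (Parent parent : knownParents) {
            for (Child candidate : parent.children) {
                if (candidate == child) { // identity comparison, as for in-memory instances
                    return parent.id;
                }
            }
        }
        return null;
    }
}
```

With 100k elements, a per-element scan of this kind would add up quickly, which is at least consistent with the time I saw disappear once the other mapping was made bi-directional.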
Thanks for your help. As I kind of expected, there was nothing wrong with Hibernate, only with my mapping.