Sub-query heap allocation beyond v6.5

We are seeing a very large increase in heap allocation by Hibernate when creating our queries when we migrate beyond 6.5.X. It looks like it may be related to our use of sub-queries. We raised HHH-19240 and see that HHH-19292 looks to be the same issue.

We would have to roughly double our pod RAM to avoid an OOM. The application in the related ticket saw its RAM go up sevenfold, and the example project I posted also uses about 7 times the heap on v6.6.11 compared with v6.5.3, for a query with sparse entity classes and no data.

The memory is allocated the first time each query is encountered and looks to be related to the use of ANTLR. Obviously, if these queries are initialised in parallel, the required heap compounds badly.
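
For anyone wanting to check the first-use allocation themselves, here is a minimal sketch, assuming a HotSpot JVM (it relies on the `com.sun.management.ThreadMXBean` extension). The `AllocationProbe` class and the `createQuery` stand-in are hypothetical names used only for illustration; the stand-in just allocates a byte array and would need to be replaced with the real query creation call.

```java
import java.lang.management.ManagementFactory;

public class AllocationProbe {

    // Keeps allocated objects reachable so the JIT cannot optimise them away.
    public static volatile Object sink;

    /**
     * Returns the approximate number of bytes the current thread allocated
     * while running the given action.
     */
    public static long allocatedBytes(Runnable action) {
        // HotSpot-specific extension of the standard ThreadMXBean (assumption).
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long threadId = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(threadId);
        action.run();
        return bean.getThreadAllocatedBytes(threadId) - before;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real work. Replace the body with the
        // actual query creation, e.g. a call to session.createQuery(...).
        Runnable createQuery = () -> sink = new byte[1_000_000];

        // With a real query, the first call should include the one-off parsing
        // cost and the second call should not. The stand-in allocates the same
        // amount both times.
        System.out.println("first call:  " + allocatedBytes(createQuery) + " bytes");
        System.out.println("second call: " + allocatedBytes(createQuery) + " bytes");
    }
}
```

Running the same measurement on v6.5.3 and v6.6.11 with an identical query should make the difference visible without needing a full heap dump. The counter only covers the calling thread, so parallel initialisation has to be measured per thread.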

We tried a more aggressive ‘compact’ GC algorithm and were able to reduce the in-flight heap to about four times the v6.5.3 level, which implies that this heap allocation cannot be substantially mitigated, even if there isn’t a leak.

So my questions are:

  1. Is the use of ANTLR in this way a done deal, so that using sub-queries will henceforth just need much more heap?
  2. Will any changes in 7.X affect this increased memory requirement for better or worse where sub-queries are in use?

Would appreciate any input on this. Hibernate is a very important library for us.

ANTLR is the library for parsing in Java, so yeah, in that sense it’s a “done deal”. We’re not going to migrate away from it just because of a slight regression.

We are well aware of the issues and will work on them once we find the time. I can only refer you to this very well written piece of text describing our situation: “Huge Project, Small Team” on the hibernate/hibernate-orm wiki on GitHub.

If Hibernate ORM, and especially this issue, is so important to you, please consider contributing. If you’re a Red Hat customer, you can open customer support tickets, which we give our highest priority. You can also try to analyze the problem and provide a PR to fix this issue, or pay someone to do it for you. If these options don’t fit your needs, you will have to wait until the Hibernate team’s priorities allow looking into this matter.

Thanks for getting back, and I completely understand you are being smashed. It is amazing what your team achieves.

I have added some detail to the ticket based on my investigations. There is a lot of context I’m not across, but it looks like the parsing of a few new keywords is causing this, potentially because the G4 grammar has become more ambiguous for the predicate. Hope this helps.