I’ve extracted Openbravo’s Hibernate model and created a test case that checks the retained heap of the SessionFactory with the legacy and dynamic batch fetch styles and different batch sizes (10 and 50).
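For illustration, a minimal sketch of how such a comparison might be set up, assuming an H2 in-memory database as a placeholder; the actual Openbravo mappings and the heap measurement itself (done from a heap dump in a profiler) are omitted:

```java
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class BatchFetchStyleHeapTest {

    // Builds a SessionFactory with the given batch fetch style and size so the
    // retained heap of each combination can be inspected from a heap dump.
    static SessionFactory build(String batchFetchStyle, int batchSize) {
        return new Configuration()
                // placeholder connection settings (H2 in-memory, for illustration only)
                .setProperty("hibernate.connection.url", "jdbc:h2:mem:test")
                .setProperty("hibernate.dialect", "org.hibernate.dialect.H2Dialect")
                .setProperty("hibernate.batch_fetch_style", batchFetchStyle)
                .setProperty("hibernate.default_batch_fetch_size", String.valueOf(batchSize))
                // .addAnnotatedClass(...) calls for the model under test go here
                .buildSessionFactory();
    }

    public static void main(String[] args) {
        for (String style : new String[] { "LEGACY", "DYNAMIC" }) {
            for (int size : new int[] { 10, 50 }) {
                SessionFactory sessionFactory = build(style, size);
                // take a heap dump here and compare the retained size of sessionFactory
                sessionFactory.close();
            }
        }
    }
}
```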
Thanks for taking the time to prepare the test case; it helped a lot in validating that we were on the right path, and fixing this will probably help a lot of our users.
The first one was not risky at all, so we also included it in the next 5.2 release: it reduces the memory used by the LoadPlan-based entity loaders (you have 13 of them with a batch size of 50) by sharing what could be shared, which brings the memory used in your case down from 1 GB to ~470 MB.
The second one was only applied to 5.3 and lazily initializes the per-lock-mode loaders (the two most commonly used are loaded eagerly, the others lazily). This accounts for the rest of the gain, but obviously, if you use some exotic lock modes, more memory will be consumed once they get used in your application. It should help anyway, as there is very little chance that you use all 11 lock modes involved for all your entities.
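For context, a loader for a less common lock mode would only be built the first time an entity is actually loaded with that mode; a sketch of such a lookup (the Invoice entity and id are hypothetical, not from the Openbravo model):

```java
import javax.persistence.EntityManager;
import javax.persistence.LockModeType;

public class LockModeUsage {

    // Hypothetical Invoice entity: the first lookup with a non-default lock
    // mode is what causes the loader for that mode to be built and retained.
    static Invoice lockForUpdate(EntityManager em, Long id) {
        return em.find(Invoice.class, id, LockModeType.PESSIMISTIC_WRITE);
    }
}
```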
@caristu The best way to control the amount of data you fetch is at query time, because the fetch strategy is dictated by the current business use case requirements.
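For example, a query-time fetch plan can be expressed with JOIN FETCH, so each use case pulls exactly the graph it needs (the Invoice entity and its `lines` collection are hypothetical names used only for illustration):

```java
import java.util.List;
import javax.persistence.EntityManager;

public class InvoiceFetchExample {

    // This particular use case needs the invoices together with their lines,
    // so it fetches them in one query instead of relying on a global
    // mapping-level fetch strategy.
    static List<Invoice> invoicesWithLines(EntityManager em, String documentNo) {
        return em.createQuery(
                "select distinct i " +
                "from Invoice i " +
                "join fetch i.lines " +
                "where i.documentNo = :documentNo", Invoice.class)
            .setParameter("documentNo", documentNo)
            .getResultList();
    }
}
```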
A default batch fetching strategy or mapping-time constructs like @Subselect, @BatchSize, or @LazyCollection are like applying a band-aid to a broken foot. They don’t really fix the actual problem; they only bring some relief in the short run.
If performance is important to you, you want to:
switch to LAZY fetching for all association types (see the sketch after this list),
use entities only if you want to later modify them and benefit from optimistic locking,
use DTOs for read-only projections (e.g. trees, tables, reports), also shown in the sketch after this list,
avoid anti-patterns like Open Session in View or enable_lazy_load_no_trans
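A minimal sketch of the LAZY mapping and a DTO projection, assuming hypothetical Invoice/InvoiceLine entities and an InvoiceSummary DTO that are not part of the Openbravo model:

```java
package com.example;

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;

// To-one associations are made LAZY explicitly, because JPA defaults
// @ManyToOne/@OneToOne to EAGER; collections are LAZY by default,
// so no @BatchSize or @LazyCollection is needed.
@Entity
class Invoice {

    @Id
    Long id;

    String documentNo;

    @OneToMany(mappedBy = "invoice")
    List<InvoiceLine> lines;
}

@Entity
class InvoiceLine {

    @Id
    Long id;

    long quantity;

    @ManyToOne(fetch = FetchType.LAZY) // explicit LAZY on the to-one side
    Invoice invoice;
}

// Read-only DTO for a table or report: only the columns the screen shows.
class InvoiceSummary {

    final String documentNo;
    final Long lineCount;

    InvoiceSummary(String documentNo, Long lineCount) {
        this.documentNo = documentNo;
        this.lineCount = lineCount;
    }
}

class InvoiceSummaryQuery {

    // Constructor-expression projection: no managed entities, no dirty
    // checking, nothing retained in the persistence context.
    static List<InvoiceSummary> invoiceSummaries(EntityManager em) {
        return em.createQuery(
                "select new com.example.InvoiceSummary(i.documentNo, count(l)) " +
                "from Invoice i join i.lines l " +
                "group by i.documentNo", InvoiceSummary.class)
            .getResultList();
    }
}
```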
If you do that, you will see that there is no need for stuff like @Subselect or @BatchSize, and not only will you get better memory utilization on the JVM side, but you will also avoid a lot of processing on the DB side (CPU, memory, IO) and on the network layer.