Careful: Query.setMaxResults() can cause huge memory usage in ListResultsConsumer (initialCollectionSize) (Hibernate 6.x)

I just want to bring up a problem I had and how I solved it. I am not sure if this is a problem in Hibernate which needs addressing or more my own stupidity. But at least I want to make you aware.

Problem

After upgrading from Hibernate 5.6.15.Final to 6.6.5.Final we noticed that a background batch processing job suddenly had heap usage peaks (up to 3GB) which did not exist before (Screenshot #1). This background job periodically iterates over a list of larger main objects and does things like calculating statistics for each item.

Screenshots (4 screenshots in one picture, because I can only post one image as a new user)

After long, unsuccessful debugging sessions I had the idea to use Java’s do-nothing garbage collector (Epsilon GC) to force an OutOfMemoryError and automatically create a heap dump (we had not managed to capture a useful heap dump with the normal GC, which had always cleaned everything up already).

In that heapdump I noticed lots of Object arrays which had a reference to org.hibernate.sql.results.spi.ListResultsConsumer,

all containing 1.048.576 items

(Screenshot #2)

all (or a lot) of them null:

(Screenshot #3)

This led me to this code:

    /**
     * Let's be reasonable, a row estimate greater than 1M rows is probably either a mis-estimation or bug,
     * so let's set 2^20 which is a bit above 1M as maximum collection size.
     */
    private static final int INITIAL_COLLECTION_SIZE_LIMIT = 1 << 20;

    final int initialCollectionSize = Math.min( jdbcValues.getResultCountEstimate(), INITIAL_COLLECTION_SIZE_LIMIT );

in

This is where a new ArrayList(1048576) was created:

    public Results(JavaType<R> resultJavaType, int initialSize) {
        this.resultJavaType = resultJavaType;
        this.results = initialSize > 0 ? new ArrayList<>( initialSize ) : new ArrayList<>();
    }

But why?

Here maybe comes the culprit (my code… shortened, but something like this)

    Query query = entityManager.createQuery(queryString);
    query.setFirstResult(0);
    query.setMaxResults(Integer.MAX_VALUE); // !!!! THIS is the culprit !!!!!!!

    for (int i = 0; i < params.length; i++) {
        query.setParameter(i + 1, params[i]);
    }
    return (T) query.getResultList();

Turns out the value passed to query.setMaxResults() is what ends up in jdbcValues.getResultCountEstimate() above, and it causes an initialCollectionSize of 1M (1,048,576) whenever I pass a value larger than that (which Integer.MAX_VALUE is).
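To put rough numbers on this, here is a small standalone sketch. It only mirrors the two quoted Hibernate lines; the class and method names are made up for illustration, and the 4 bytes per reference is an assumption (compressed object pointers, the usual default on 64-bit JVMs with heaps below 32 GB).

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: mirrors the quoted Hibernate lines, it is not the real class.
class PresizeDemo {

    // Same cap as INITIAL_COLLECTION_SIZE_LIMIT in the quoted source (2^20).
    static final int INITIAL_COLLECTION_SIZE_LIMIT = 1 << 20;

    // Assumption: 4 bytes per object reference (compressed oops).
    static final int ASSUMED_BYTES_PER_REFERENCE = 4;

    // Clamp a result count estimate the way the quoted Math.min(...) call does.
    static int initialCollectionSize(int resultCountEstimate) {
        return Math.min(resultCountEstimate, INITIAL_COLLECTION_SIZE_LIMIT);
    }

    // Rough size of the backing Object[] alone, ignoring the array header.
    static long approxBackingArrayBytes(int capacity) {
        return (long) capacity * ASSUMED_BYTES_PER_REFERENCE;
    }

    public static void main(String[] args) {
        // Integer.MAX_VALUE as "unlimited" gets clamped to the 2^20 cap.
        int size = initialCollectionSize(Integer.MAX_VALUE);

        // ArrayList(int) allocates the backing array immediately, even with no rows.
        List<Object> results = new ArrayList<>(size);

        System.out.println("initial capacity: " + size);
        System.out.println("approx. bytes for the empty list: " + approxBackingArrayBytes(size));
        System.out.println("rows actually stored: " + results.size());
    }
}
```

Under that assumption every query executed this way allocates roughly 4 MB for a mostly empty array, which would fit the many Object arrays full of nulls in the heap dump when the job runs a query per item.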

In the past I have used Integer.MAX_VALUE basically to say “unlimited”.
Probably I am using setMaxResults() incorrectly, but before Hibernate 6 we did not have heap usage issues, so this code was there for years and nobody cared.

Javadoc says:

    /**
     * Set the maximum number of results to retrieve.
     * @param maxResult  maximum number of results to retrieve
     * @return the same query instance
     * @throws IllegalArgumentException if the argument is negative
     */
    Query setMaxResults(int maxResult);

The fix / hack was this (Integer.MAX_VALUE was not set directly, but was passed as a param)

    if (maxResults < Integer.MAX_VALUE) {
        query.setMaxResults(maxResults);
    }

So basically we do not call query.setMaxResults() when Integer.MAX_VALUE is passed.
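If the same check is needed in several places, it could be pulled into a small helper. This is only a sketch: MaxResultsGuard, applyIfLimited and UNLIMITED are hypothetical names, and the setter is passed as a java.util.function.IntConsumer so the example runs without Hibernate on the classpath.

```java
import java.util.function.IntConsumer;

// Sketch of the workaround; all names here are hypothetical.
class MaxResultsGuard {

    // Sentinel that the calling code uses to mean "no limit".
    static final int UNLIMITED = Integer.MAX_VALUE;

    // Calls the setter only for a real limit; returns true if it was called.
    static boolean applyIfLimited(int maxResults, IntConsumer setMaxResults) {
        if (maxResults < UNLIMITED) {
            setMaxResults.accept(maxResults);
            return true;
        }
        // "Unlimited": do not touch the query, so no huge limit is passed on.
        return false;
    }

    public static void main(String[] args) {
        int[] lastLimit = { -1 }; // stands in for a real Query in this demo

        System.out.println(applyIfLimited(100, v -> lastLimit[0] = v));       // prints "true"
        System.out.println(applyIfLimited(UNLIMITED, v -> lastLimit[0] = v)); // prints "false"
        System.out.println(lastLimit[0]);                                     // prints "100"
    }
}
```

With a real query the call would look like MaxResultsGuard.applyIfLimited(maxResults, query::setMaxResults); the Query that setMaxResults returns is simply discarded by the method reference.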

Happy end

Now the heap spikes are gone and peaks are around 250MB instead of 2-3GB.

(Screenshot #4)

So be careful when using Query.setMaxResults(). I don’t know if the Hibernate team should do anything… maybe a clarification in the javadoc of setMaxResults()… or maybe this ArrayList pre-sizing should be revisited.

Anyway hope this little debugging story is helpful for someone and saves them some time.

Thanks for the insights @crueger, this is definitely an interesting turn of events.

I would say that setMaxResults should outright not be used whenever the user does not intend to limit the number of results returned by a query; this does not seem to be mentioned in either the javadoc or the user guide, so it might be a good idea to add some clarification for it - especially given the recent work on presizing of result collections.

Feel free to open a ticket and, while you’re at it, you could have a go at adding these details to the documentation yourself; it would be a great opportunity to contribute to Hibernate.

Thanks @mbladel
I created HHH-19089 and PR