Difference between Query#getResultStream() / Query#getResultList() and JOIN FETCH

If I use JOIN FETCH and DISTINCT, Hibernate removes the duplicates when I call getResultList(). My (probably wrong) expectation was that the same would hold for getResultStream().

But as my tests show, it does not. So my questions are:

  • Is it expected that Hibernate doesn’t do the deduplication on getResultStream() as well?
  • Should I rather use getResultStream().distinct()... or getResultList().stream()...?

It was pointed out in the Twitter thread I started that this problem goes away when the result is ordered by kl.id. That also explains why our unit tests did not find the problem: they set up data in a way (and I assume most tests do) that the natural order of the rows in the joined table exactly matches the order in the “klient” table.

I think this makes the behavior worse, because some sort of deduplication is going on, but only if the entities are returned in order.
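To illustrate why ordering can hide the problem, here is a minimal sketch of a deduplication that only compares each row with the row directly before it. This is a hypothetical model of the behavior I observed, not Hibernate's actual code; the class and method names are made up, and each number stands for the “klient” id of one joined row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, NOT Hibernate code: each element is the root entity id
// ("klient" id) of one row produced by the JOIN FETCH.
class AdjacentDedupSketch {

    // Drops a row only when it has the same root id as the row directly before it.
    static List<Long> collapseAdjacent(List<Long> rootIdsPerRow) {
        List<Long> result = new ArrayList<>();
        Long previous = null;
        for (Long id : rootIdsPerRow) {
            if (!id.equals(previous)) {
                result.add(id);
            }
            previous = id;
        }
        return result;
    }

    public static void main(String[] args) {
        // Rows grouped by root id (as with ORDER BY kl.id): duplicates disappear.
        System.out.println(collapseAdjacent(List.of(1L, 1L, 2L, 2L))); // [1, 2]
        // Rows interleaved: the same root id shows up more than once.
        System.out.println(collapseAdjacent(List.of(1L, 2L, 1L, 2L))); // [1, 2, 1, 2]
    }
}
```

With grouped rows the result looks fully deduplicated, which is what typical test data produces; with interleaved rows the duplicates survive.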

getResultStream() makes use of the JDBC scrolling API, which is lazy, i.e. it fetches rows on demand. De-duplication usually requires that a set of objects is materialized to check for duplicates, which defeats the purpose of streaming, as you should only stream if chances are high that you can’t fit all your results in memory.
I would advise you against using getResultStream() unless you are sure that memory consumption is an issue, as the JDBC scrolling API is usually implemented with database cursors, which brings other possible issues, especially if the cursor is kept open for long.
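As a rough sketch of the getResultStream().distinct() option from the question: Stream.distinct() has to remember every element it has already emitted, so the memory it holds grows with the number of distinct root entities. The example below uses a plain in-memory stream rather than a real query, and assumes the stream hands back the same object instance for the same id; Klient and dedupe are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamDistinctSketch {

    // Hypothetical stand-in for the root entity. It does not override equals(),
    // so distinct() compares by identity, which is enough under the assumption
    // that the same instance is returned for the same id.
    static final class Klient {
        final long id;

        Klient(long id) {
            this.id = id;
        }
    }

    // distinct() keeps track of all elements seen so far, so it is not free
    // for large results even though the source stream is lazy.
    static List<Klient> dedupe(Stream<Klient> rows) {
        return rows.distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Klient a = new Klient(1);
        Klient b = new Klient(2);
        // Interleaved rows, as they may come back without an ORDER BY.
        List<Klient> result = dedupe(Stream.of(a, b, a, b));
        System.out.println(result.size()); // 2
    }
}
```

This removes the duplicates regardless of row order, at the cost of holding a reference to every distinct entity until the stream is consumed.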

I guess vertragsDaten is a collection? If so, then streaming/scrolling won’t work for you anyway, because it essentially requires that the entity cardinality matches the row cardinality, i.e. join-fetched collections are disallowed.

Yes, vertragsDaten is a collection. Your conclusion that FETCH JOINs are disallowed sounds reasonable, but if that is true I would expect Hibernate to throw an error or at least log a warning.

As I stated in my other reply, the main problem is that Hibernate does remove duplicates under some circumstances, so there is a high likelihood that people get hit by this in production, because their JUnit test setup most likely creates data for which Hibernate does deduplicate.

Hibernate 6 by default does deduplication for queries that select a single entity alias. Actually, it seems like there is special code for scrollable results when collection fetches are involved, though it doesn’t seem to handle deduplication. Also see FetchingScrollableResultsImpl. In that case, I’d classify this as a bug, so please create a JIRA issue and attach a reproducer test case.