Why are object references duplicated for each row of ResultSet?

For example, given this data model:

@Entity
public class Library {
	@Id
	private Long id;
	
	@OneToMany
	private List<Book> books = new ArrayList<>();
	
	@OneToMany
	private Set<Movie> movies = new HashSet<>();
}

This JPQL:

SELECT l FROM Library AS l LEFT JOIN FETCH l.books LEFT JOIN FETCH l.movies WHERE l.id = :id

will return 10 rows if there are 2 books and 5 movies associated with that Library. The session will create a single Library instance, but return 10 duplicate references. Likewise there will be 2 Book instances, with 5 duplicate references to each one in the books collection. movies will not have duplicates because of Set’s inherent filtering.

I’m curious why Hibernate behaves like this instead of intuitively “boiling” down to unique references. Is this because of a technical reason, or is there a use-case which requires duplicates?

There are cases where it is impossible to eliminate duplicates when a bag join is involved, since bags are unordered and allow duplicates. So Hibernate doesn’t try to be smart and cause you headaches in certain scenarios, but instead just returns you the same cardinality as the JDBC result set. You can tell Hibernate to return unique objects which works by using DISTINCT. Note that the DISTINCT keyword is not necessarily passed to the SQL if you configure this a hint as outlined in the documentation: Hibernate ORM 5.4.29.Final User Guide

DISTINCT de-duplicates the root entity, but it doesn’t affect the child collections. In my example, the ‘books’ list will contain 2 unique objects duplicated 5 times each for 10 references total. Even though I called a valid entity query, I received an incorrect representation of the object state. Maybe I’m misunderstanding something, but this implementation feels misleading. If bag X bag fetches are illegal, then shouldn’t any sort of multi-fetch involving a bag be illegal for consistency?

A simple example to demonstrate the problem:

@Entity
public class Library {
	@Id
	public Long id;
	@ElementCollection
	public Set<String> books;
	@ElementCollection
	public List<String> movies;
}
	
@Test
void test() {
	try (Session session = ...) {
		Library lib = new Library();
		lib.id = 1L;
		lib.books = Set.of("b1", "b2");
		lib.movies = List.of("m1", "m2");
		
		session.getTransaction().begin();
		session.persist(lib);
		session.getTransaction().commit();
	}
	
	try (Session session = ...) {
		TypedQuery<Library> query = session.createQuery(
				"SELECT l FROM Library AS l "
				+ "LEFT JOIN FETCH l.books "
				+ "LEFT JOIN FETCH l.movies", 
				Library.class);
		Library lib = query.getSingleResult();
		
		session.getTransaction().begin();
		lib.movies.add("m3");
		session.getTransaction().commit();
	}
	
	try (Session session = ...) {
		Library lib = session.get(Library.class, 1L);
		lib.movies.forEach(System.out::println);
		// expected: m1, m2, m3
		// actual:	 m1, m2, m1, m2, m3
	}
}

Yeah you are right, this should fail. Please create an issue in the issue tracker(https://hibernate.atlassian.net) with a test case(hibernate-test-case-templates/JPAUnitTestCase.java at main ¡ hibernate/hibernate-test-case-templates ¡ GitHub) that reproduces the issue.

Sorry for the delay, I’ve opened an issue here.

1 Like