Using session.get() API returning incorrect results

I have a bi-directional many to many assocation. Movies ↔ MovieEmployees ↔ Employee

@Entity
@Table(name = "movies")
public class Movie {
	@EmbeddedId
	private MoviePrimaryKey moviePrimaryKey;
	
	@OneToMany(
			mappedBy = "movie",
			orphanRemoval = true,
			fetch = FetchType.EAGER		
		)
	List<MovieEmployee> employees = new ArrayList<>();	
@Entity
@Table(name = "movies_employees")
public class MovieEmployee {
	@Id
	@ManyToOne(fetch = FetchType.LAZY)
	private Movie movie;
	@Id
	@ManyToOne(fetch = FetchType.EAGER)
	private Employee employee;
	@Column(name = "job_position")
	private String jobPosition;
	@Column
	private String role;
	
	public MovieEmployee() {
@Entity
@Table(name = "employees")
public class Employee {
	@EmbeddedId
	private EmployeePrimaryKey employeePrimaryKey;
	@Column(name = "birth_date")
	private LocalDate birthDate;
	@Column(name = "country_of_origin")
	private String countryOfOrigin;
	private String synopsis;
	@Column(name = "popularity_ranking")
	private Integer popularityRanking;
	@Column(name = "img_location")
	private String imgLocation;
	
	@OneToMany(
			mappedBy = "employee",
			orphanRemoval = true,
			fetch = FetchType.LAZY
		)
	List<MovieEmployee> movieEmployees = new ArrayList<>();

Using the .get API

		Session session = sessionFactory.openSession();
		Transaction transaction = session.beginTransaction();
		Movie movie = session.get(Movie.class, moviePrimaryKey);
		session.close();

The returned movie will contain duplicate MovieEmployee objects in the employees field in the Movie entity. I’ve turned on SQL statements used and verified that statement against the database and it does NOT return any duplicates. Any idea why I would get duplicate values? The sql statement is as follows:

Hibernate: 
    select
        m1_0.original_title,
        m1_0.release_date,
        m1_0.alternative_title,
        m1_0.box_office,
        m1_0.budget,
        m1_0.hook,
        m1_0.img_location,
        m1_0.motion_picture_rating,
        m1_0.motion_picture_rating_desc,
        m1_0.running_time,
        m1_0.score,
        m1_0.synopsis,
        m1_0.themes,
        e1_0.movie_original_title,
        e1_0.movie_release_date,
        e1_0.employee_identical_name_id,
        e1_0.employee_name,
        e2_0.identical_name_id,
        e2_0.name,
        e2_0.birth_date,
        e2_0.country_of_origin,
        e2_0.img_location,
        e2_0.popularity_ranking,
        e2_0.synopsis,
        e1_0.job_position,
        e1_0.role 
    from
        movies m1_0 
    left join
        movies_employees e1_0 
            on m1_0.original_title=e1_0.movie_original_title 
            and m1_0.release_date=e1_0.movie_release_date 
    left join
        employees e2_0 
            on e2_0.identical_name_id=e1_0.employee_identical_name_id 
            and e2_0.name=e1_0.employee_name 
    where
        (
            m1_0.original_title, m1_0.release_date
        ) in ((?, ?))

Are you sure you updated Hibernate to the latest version 6.6.4? If so, and you still have the problem, please create an issue in our issue tracker with a test case template that reproduces the issue.

Before I do that quick question - does the join table’s (in this case MovieEmployee) primary key need to be a unique identifier? In this case the pk for the two duplicate rows are identical. The non primary key columns is what differentiates them.

Of course, the columns for the fields annotated with @Id must form a unique tuple i.e. as far as I understand your model, you need primary key (movie_original_title, movie_release_date)

Ok so I was able to fix my issue but I don’t really understand 100%. My junction table in my database didn’t have a primary key constraint, and my JPA annotated class, as you mentioned has its’ key as (movie_original_title, movie_release_date). I have two rows in my junction table that have the same primary key. When I run the sql query that hibernate runs, I do see all the correctly returned rows, but it appears that when hibernate converts those returned rows it is unable to resolve the two rows with identical primary keys. Any idea why that is?

The solution for me was adding a primary key constraint to the junction table in my database, and adding the additional columns or “role” and “jobPosition” with @Id annotations so that from the perspective of the entities each entity in the junction table will have unique primary keys.

There is nothing to “convert”, Hibernate ORM simply treats every row that is returned as an element for a collection fetch. Since you’re using a list, you get duplicates. If you don’t want duplicates because the join condition is based on a unique key (which it should), then you should use a Set instead.

Let me clarify before my “fix” that the list had a row completely missing, and had instead replaced this missing row with a duplicate row. This row that was duplicated had the same pk as the one that was missing (i.e. the junction table did not enforce uniqueness for the pk)

So adding the primary/unique key constraint failed initially because of duplicate rows? In that case, this behavior is expected.

Let me clarify.

The get API uses this query for my movie object

Hibernate: 
    select
        m1_0.original_title,
        m1_0.release_date,
        m1_0.alternative_title,
        m1_0.box_office,
        m1_0.budget,
        m1_0.hook,
        m1_0.img_location,
        m1_0.motion_picture_rating,
        m1_0.motion_picture_rating_desc,
        m1_0.running_time,
        m1_0.score,
        m1_0.synopsis,
        m1_0.themes,
        e1_0.movie_original_title,
        e1_0.movie_release_date,
        e1_0.employee_identical_name_id,
        e1_0.employee_name,
        e2_0.identical_name_id,
        e2_0.name,
        e2_0.birth_date,
        e2_0.country_of_origin,
        e2_0.img_location,
        e2_0.popularity_ranking,
        e2_0.synopsis,
        e1_0.job_position,
        e1_0.role 
    from
        movies m1_0 
    left join
        movies_employees e1_0 
            on m1_0.original_title=e1_0.movie_original_title 
            and m1_0.release_date=e1_0.movie_release_date 
    left join
        employees e2_0 
            on e2_0.identical_name_id=e1_0.employee_identical_name_id 
            and e2_0.name=e1_0.employee_name 
    where
        (
            m1_0.original_title, m1_0.release_date
        ) in ((?, ?))

which correctly returns (from my database)


Note that in the above screenshot the first two rows have same employee name. In this case Parker Finn is both a writer and director. In below junction table (pre-fix), the @Id annotation for this class is the embedded id for Movie, and Employee. These two rows have the same key in this case. This is where I believe the issue is - however you had mentioned that whatever the database returns from the query is what hibernate will load in the domain class. However what I got was row 1 duplicated.

Pre-fix my junction table looked like below. Not to mention, I did not add a PK to the junction table in the sql database (not sure if that matters).

@Entity
@Table(name = "movies_employees")
public class MovieEmployee {
	@Id
	@ManyToOne(fetch = FetchType.LAZY)
	private Movie movie;
	@Id
	@ManyToOne(fetch = FetchType.EAGER)
	private Employee employee;
	@Column(name = "job_position")
	private String jobPosition;
	@Column
	private String role;
	
	public MovieEmployee() {

Post fix my java junction class looks like below. Now for the duplicated rows for Parker Finn, job position was different, so now i’ve added the @Id annotation for that field, as well as created the same PK for this table in my sql database. The correct domain objects are now being returned from hibernate (no more duplicates).

@Entity
@Table(name = "movies_employees")
public class MovieEmployee {
	@Id
	@ManyToOne(fetch = FetchType.LAZY)
	private Movie movie;
	@Id
	@ManyToOne(fetch = FetchType.EAGER)
	private Employee employee;
	@Id
	@Column(name = "job_position")
	private String jobPosition;
	@Id
	@Column
	private String role;

I really am just curious why my initial implementation was wrong - you had stated that hibernate simply uses whatever the query returns, however that was not the observed behavior (returned row 1 in the screenshot, twice).

By using the correct mapping (i.e. matching the database schema) you get the correct results.
I guess you had a duplicate before, because Hibernate ORM assumed for a row that it already is contained in the persistence context and just reused that object, whereas now you get two distinct objects with different job positions and roles.