Hibernate with EAGER loading issues two queries instead of one


#1

Hi,

we have a rather strange problem. We’ve tried to find solutions but honestly we couldn’t even create the search phrase that would give us any useful results, so here it is.

We have a table structure like this:

CREATE TABLE `test`.`a` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `some_field` VARCHAR(45) NULL,
  `B_id` INT(11) NULL,
  `C_id` INT(11) NULL,
  `D_id` INT(11) NULL,
  PRIMARY KEY (`id`));

CREATE TABLE `test`.`b` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `some_field` VARCHAR(45) NULL,
  PRIMARY KEY (`id`));

CREATE TABLE `test`.`c` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `some_field` VARCHAR(45) NULL,
  `D_id` INT(11) NULL,
  PRIMARY KEY (`id`));
  
CREATE TABLE `test`.`d` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `some_field` VARCHAR(45) NULL,
  `E_id` INT(11) NULL,
  PRIMARY KEY (`id`));

CREATE TABLE `test`.`e` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `some_field` VARCHAR(45) NULL,
  PRIMARY KEY (`id`));

The structure looks like this:

       A
       |
---------------
|      |      |
B      C      D
       |      |
       D      E
       |
       E

the problem is with the D-E table structure, which is referenced twice. When we try to eager load this dataset, once hibernate loaded this structure once, it won’t load it again, but stops at joining D without E.

So the query hibernate puts together will look something like this:

SELECT a.some_field, ...,
FROM A as a
LEFT OUTER JOIN B b on a.B_id = b.id
LEFT OUTER JOIN C c on a.C_id = c.id
LEFT OUTER JOIN D d on c.D_id = d.id
LEFT OUTER JOIN E e on d.E_id = e.id
LEFT OUTER JOIN D d on a.D_id = d.id
<<LEFT JOIN to E from A->D is missing here>>

This structure repeats a lot in our original data structure, these are always OneToOne connections, never looping into each other, it’s a clear tree structure where we re-use the same structures over and over again, but it’s never too deep (max join depth is 5), it contains no loops and many times not all connections are used, so the generated query itself is large, but the dataset behind it often contains a lot of empty connections.

Is there a way we could achive to load the whole dataset eagerly? Is there some configuration variable we are not aware of, or is this a bug in Hibernate?


#2

You need to add:

  • the entities,
  • the data access code and
  • the Hibernate logs

It’s not clear what the problem is now.


#3

Hi,

we’ve uploaded an example project to github, you can find it here: Example java project

The interesting part in the log is (the full log is available in the github project here: logfile):

1802 [main] INFO main  - GET ENTITY
1814 [main] DEBUG org.hibernate.SQL  - select a0_.`id` as id1_0_0_, a0_.`B_id` as B_id3_0_0_, a0_.`C_id` as C_id4_0_0_, a0_.`D_id` as D_id5_0_0_, a0_.`some_field` as some_fie2_0_0_, b1_.`id` as id1_1_1_, b1_.`some_field` as some_fie2_1_1_, c2_.`id` as id1_2_2_, c2_.`D_id` as D_id3_2_2_, c2_.`some_field` as some_fie2_2_2_, d3_.`id` as id1_3_3_, d3_.`E_id` as E_id3_3_3_, d3_.`some_field` as some_fie2_3_3_, e4_.`id` as id1_4_4_, e4_.`some_field` as some_fie2_4_4_, d5_.`id` as id1_3_5_, d5_.`E_id` as E_id3_3_5_, d5_.`some_field` as some_fie2_3_5_ from `test`.`a` a0_ left outer join `test`.`b` b1_ on a0_.`B_id`=b1_.`id` left outer join `test`.`c` c2_ on a0_.`C_id`=c2_.`id` left outer join `test`.`d` d3_ on c2_.`D_id`=d3_.`id` left outer join `test`.`e` e4_ on d3_.`E_id`=e4_.`id` left outer join `test`.`d` d5_ on a0_.`D_id`=d5_.`id` where a0_.`id`=?
Hibernate: select a0_.`id` as id1_0_0_, a0_.`B_id` as B_id3_0_0_, a0_.`C_id` as C_id4_0_0_, a0_.`D_id` as D_id5_0_0_, a0_.`some_field` as some_fie2_0_0_, b1_.`id` as id1_1_1_, b1_.`some_field` as some_fie2_1_1_, c2_.`id` as id1_2_2_, c2_.`D_id` as D_id3_2_2_, c2_.`some_field` as some_fie2_2_2_, d3_.`id` as id1_3_3_, d3_.`E_id` as E_id3_3_3_, d3_.`some_field` as some_fie2_3_3_, e4_.`id` as id1_4_4_, e4_.`some_field` as some_fie2_4_4_, d5_.`id` as id1_3_5_, d5_.`E_id` as E_id3_3_5_, d5_.`some_field` as some_fie2_3_5_ from `test`.`a` a0_ left outer join `test`.`b` b1_ on a0_.`B_id`=b1_.`id` left outer join `test`.`c` c2_ on a0_.`C_id`=c2_.`id` left outer join `test`.`d` d3_ on c2_.`D_id`=d3_.`id` left outer join `test`.`e` e4_ on d3_.`E_id`=e4_.`id` left outer join `test`.`d` d5_ on a0_.`D_id`=d5_.`id` where a0_.`id`=?
1838 [main] DEBUG org.hibernate.loader.plan.exec.process.internal.ResultSetProcessorImpl  - Starting ResultSet row #0
1840 [main] DEBUG org.hibernate.loader.plan.exec.process.internal.EntityReferenceInitializerImpl  - On call to EntityIdentifierReaderImpl#resolve, EntityKey was already known; should only happen on root returns with an optional identifier specified
1848 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Resolving associations for [csat.hibernatetest.jpa.A#10]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Done materializing entity [csat.hibernatetest.jpa.A#10]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Resolving associations for [csat.hibernatetest.jpa.B#20]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Done materializing entity [csat.hibernatetest.jpa.B#20]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Resolving associations for [csat.hibernatetest.jpa.C#30]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Done materializing entity [csat.hibernatetest.jpa.C#30]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Resolving associations for [csat.hibernatetest.jpa.D#41]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Done materializing entity [csat.hibernatetest.jpa.D#41]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Resolving associations for [csat.hibernatetest.jpa.E#51]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Done materializing entity [csat.hibernatetest.jpa.E#51]
1851 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Resolving associations for [csat.hibernatetest.jpa.D#40]
1851 [main] DEBUG org.hibernate.SQL  - select e0_.`id` as id1_4_0_, e0_.`some_field` as some_fie2_4_0_ from `test`.`e` e0_ where e0_.`id`=?
Hibernate: select e0_.`id` as id1_4_0_, e0_.`some_field` as some_fie2_4_0_ from `test`.`e` e0_ where e0_.`id`=?
1852 [main] DEBUG org.hibernate.loader.plan.exec.process.internal.ResultSetProcessorImpl  - Starting ResultSet row #0
1852 [main] DEBUG org.hibernate.loader.plan.exec.process.internal.EntityReferenceInitializerImpl  - On call to EntityIdentifierReaderImpl#resolve, EntityKey was already known; should only happen on root returns with an optional identifier specified
1853 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Resolving associations for [csat.hibernatetest.jpa.E#50]
1853 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Done materializing entity [csat.hibernatetest.jpa.E#50]
1853 [main] DEBUG org.hibernate.resource.jdbc.internal.ResourceRegistryStandardImpl  - HHH000387: ResultSet's statement was not registered
1853 [main] DEBUG org.hibernate.loader.entity.plan.AbstractLoadPlanBasedEntityLoader  - Done entity load : csat.hibernatetest.jpa.E#50
1854 [main] DEBUG org.hibernate.engine.internal.TwoPhaseLoad  - Done materializing entity [csat.hibernatetest.jpa.D#40]
1854 [main] DEBUG org.hibernate.resource.jdbc.internal.ResourceRegistryStandardImpl  - HHH000387: ResultSet's statement was not registered
1854 [main] DEBUG org.hibernate.loader.entity.plan.AbstractLoadPlanBasedEntityLoader  - Done entity load : csat.hibernatetest.jpa.A#10
1854 [main] DEBUG org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl  - Initiating JDBC connection release from afterTransaction
1854 [main] INFO main  - ENTITY SUCCESSFULLY RETRIEVED

As you can see, even though we request the whole object structure (eager is default for fetching), it still needs two queries to get it.

Our preferred outcome is that hibernate gets the whole entity with one query.

Let me know if you need anything else. Thanks for the help!


#4

As you can see, even though we request the whole object structure (eager is default for fetching), it still needs two queries to get it.

Nope. There is one query. But it is logged twice because you set both a Log appended as well as the Environment.SHOW_SQL property which should never be set.

Remove this line:

settings.put(Environment.SHOW_SQL, "true");

And you’ll see a single SQL query in the log.

Szívesen


#5

Hi,

this is what we see on SQL side:

2019-02-15T09:49:31.305452Z        2 Query    SET autocommit=0
2019-02-15T09:49:31.323433Z        2 Query    select a0_.`id` as id1_0_0_, a0_.`B_id` as B_id3_0_0_, a0_.`C_id` as C_id4_0_0_, a0_.`D_id` as D_id5_0_0_, a0_.`some_field` as some_fie2_0_0_, b1_.`id` as id1_1_1_, b1_.`some_field` as some_fie2_1_1_, c2_.`id` as id1_2_2_, c2_.`D_id` as D_id3_2_2_, c2_.`some_field` as some_fie2_2_2_, d3_.`id` as id1_3_3_, d3_.`E_id` as E_id3_3_3_, d3_.`some_field` as some_fie2_3_3_, e4_.`id` as id1_4_4_, e4_.`some_field` as some_fie2_4_4_, d5_.`id` as id1_3_5_, d5_.`E_id` as E_id3_3_5_, d5_.`some_field` as some_fie2_3_5_ from `test`.`a` a0_ left outer join `test`.`b` b1_ on a0_.`B_id`=b1_.`id` left outer join `test`.`c` c2_ on a0_.`C_id`=c2_.`id` left outer join `test`.`d` d3_ on c2_.`D_id`=d3_.`id` left outer join `test`.`e` e4_ on d3_.`E_id`=e4_.`id` left outer join `test`.`d` d5_ on a0_.`D_id`=d5_.`id` where a0_.`id`=10
2019-02-15T09:49:31.370746Z        2 Query    select e0_.`id` as id1_4_0_, e0_.`some_field` as some_fie2_4_0_ from `test`.`e` e0_ where e0_.`id`=50

It does run 2 queries, the hibernate log was correct.


#6

I ran your application and that’s what it logs:

5746 [main] DEBUG org.hibernate.internal.SessionFactoryRegistry  - Initializing SessionFactoryRegistry : org.hibernate.internal.SessionFactoryRegistry@322803db
5748 [main] DEBUG org.hibernate.internal.SessionFactoryRegistry  - Registering SessionFactory: 6499ae56-423a-45b1-8af6-5771740b91b6 (<unnamed>)
5749 [main] DEBUG org.hibernate.internal.SessionFactoryRegistry  - Not binding SessionFactory to JNDI, no JNDI name configured
5883 [main] DEBUG org.hibernate.stat.internal.StatisticsInitiator  - Statistics initialized [enabled=false]
5911 [main] INFO main  - GET ENTITY
5925 [main] DEBUG org.hibernate.SQL  - select a0_.`id` as id1_0_0_, a0_.`B_id` as B_id3_0_0_, a0_.`C_id` as C_id4_0_0_, a0_.`D_id` as D_id5_0_0_, a0_.`some_field` as some_fie2_0_0_, b1_.`id` as id1_1_1_, b1_.`some_field` as some_fie2_1_1_, c2_.`id` as id1_2_2_, c2_.`D_id` as D_id3_2_2_, c2_.`some_field` as some_fie2_2_2_, d3_.`id` as id1_3_3_, d3_.`E_id` as E_id3_3_3_, d3_.`some_field` as some_fie2_3_3_, e4_.`id` as id1_4_4_, e4_.`some_field` as some_fie2_4_4_, d5_.`id` as id1_3_5_, d5_.`E_id` as E_id3_3_5_, d5_.`some_field` as some_fie2_3_5_ from A a0_ left outer join B b1_ on a0_.`B_id`=b1_.`id` left outer join C c2_ on a0_.`C_id`=c2_.`id` left outer join D d3_ on c2_.`D_id`=d3_.`id` left outer join E e4_ on d3_.`E_id`=e4_.`id` left outer join D d5_ on a0_.`D_id`=d5_.`id` where a0_.`id`=?
5935 [main] DEBUG org.hibernate.resource.jdbc.internal.ResourceRegistryStandardImpl  - HHH000387: ResultSet's statement was not registered
5935 [main] DEBUG org.hibernate.loader.entity.plan.AbstractLoadPlanBasedEntityLoader  - Done entity load : csat.hibernatetest.jpa.A#10
5935 [main] DEBUG org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl  - Initiating JDBC connection release from afterTransaction
5936 [main] INFO main  - ENTITY SUCCESSFULLY RETRIEVED

So, it’s just one SQL query.

If you see 2 queries, it’s because you are doing something else which is not included in that test case.

For more details about how lazy loading works in Hibernate, check out the User Guide.

Months of debugging Hibernate performance issues can save days of reading the documentation.

So, don’t try to skip reading the documentation.


#7

Hi,

if you take a closer look at the query you see that it’s missing entity E from the end of the A->D->E connection. The second query is missing for you because there is no data in the tables so the actual connection is missing from D to E thus Hibarnate doesn’t bother to run the second query (which is good). But when you add actual data (please find script here: SQL data insert) you will see that it will generate two queries.

We have read the documentation, this is exactly why we are here, because it’s not giving us any answers on why this is happening and how to avoid it.


#8

Use this test case template to provide a fully functional example that demonstrates the issue.

I’ll take a look at it afterward.


#9

Hi,

we re-created the issue with the provided test case template (can be found here). However we could not write a proper JUnit test for this case, as we couldn’t do any assert on how many queries are run in the background.

Let me know if you need anything else.

Thanks,
Zoltan


#10

Thanks for the test case. It looks like a bug to me. You should open a Jira issue and attach the hibernate-5 project containing the test case.


#11

Alright, will do that, thanks!