Hello,
First of all, great tool and I really appreciate your work! I noticed that for my particular use case there is different SQL query generated for the same HQL query in Hibernate versions 6.5.0 and 6.5.1. I created a test that reproduces this issues and I’ll try to explain why this is problem for me.
I have 2 entities that have bi-directional associacion and a query that fetch joins them and filter by some values.
SELECT p from Parent p LEFT JOIN FETCH p.child c WHERE (c.externalId = :providedId or c.id = :providedId) and p.someFilter = :filterValue
In v6.5.0 this query would result following SQL query:
select
p1_0.id,
c1_0.id,
c1_0.external_id,
p1_0.some_filter
from
parents p1_0
left join
children c1_0
on c1_0.id=p1_0.child_id
where
(
c1_0.external_id=?
or c1_0.id=?
)
and p1_0.some_filter=?
This query uses indexes correctly for OR condition as indicated by explain analyze query:
Nested Loop (cost=8.34..37.26 rows=1 width=104) (actual time=3.158..3.162 rows=1 loops=1)
Join Filter: ((p1_0.child_id)::text = (c1_0.id)::text)
-> Bitmap Heap Scan on children c1_0 (cost=8.34..16.83 rows=5 width=64) (actual time=3.144..3.145 rows=1 loops=1)
Recheck Cond: (((external_id)::text = '1'::text) OR ((id)::text = '1'::text))
Heap Blocks: exact=1
-> BitmapOr (cost=8.34..8.34 rows=5 width=0) (actual time=3.138..3.139 rows=0 loops=1)
-> Bitmap Index Scan on external_id_idx (cost=0.00..4.18 rows=4 width=0) (actual time=1.959..1.959 rows=1 loops=1)
Index Cond: ((external_id)::text = '1'::text)
-> Bitmap Index Scan on children_pkey (cost=0.00..4.16 rows=1 width=0) (actual time=1.178..1.178 rows=1 loops=1)
Index Cond: ((id)::text = '1'::text)
-> Materialize (cost=0.00..20.14 rows=4 width=72) (actual time=0.011..0.012 rows=1 loops=1)
-> Seq Scan on parents p1_0 (cost=0.00..20.12 rows=4 width=72) (actual time=0.008..0.008 rows=1 loops=1)
Filter: ((some_filter)::text = 'some-value'::text)
The same HQL query executed with Hibernate 6.5.1 results with following SQL query:
select
p1_0.id,
c1_0.id,
c1_0.external_id,
p1_0.some_filter
from
parents p1_0
left join
children c1_0
on c1_0.id=p1_0.child_id
where
(
c1_0.external_id=?
or p1_0.child_id=?
)
and p1_0.some_filter=?
The change is in OR condition - c1_0.id=?
became p1_0.child_id=?
which resulted in following query plan:
Hash Right Join (cost=20.18..42.34 rows=1 width=104) (actual time=0.020..0.024 rows=1 loops=1)
Hash Cond: ((c1_0.id)::text = (p1_0.child_id)::text)
Filter: (((c1_0.external_id)::text = '1'::text) OR ((p1_0.child_id)::text = '1'::text))
-> Seq Scan on children c1_0 (cost=0.00..18.80 rows=880 width=64) (actual time=0.002..0.003 rows=1 loops=1)
-> Hash (cost=20.12..20.12 rows=4 width=72) (actual time=0.011..0.011 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on parents p1_0 (cost=0.00..20.12 rows=4 width=72) (actual time=0.008..0.008 rows=1 loops=1)
Filter: ((some_filter)::text = 'some-value'::text)
Where index scan on children
table turned to sequence scan.
Provided example is rather trivial but in my production use case with multiple joins and millions of rows this becomes not very performant.
I have prepared 2 PRs that compare which table was used for the same query:
6.5.0 - Bi-directional query reproducer v.6.5.0 by kamil-pociot · Pull Request #455 · hibernate/hibernate-test-case-templates · GitHub
6.5.1 - Bi-directional query reproducer v.6.5.1 by kamil-pociot · Pull Request #456 · hibernate/hibernate-test-case-templates · GitHub
The same behaviour is present in v 6.6.x
I’d appreciate a feedback if this is intended behaviour.
Best,
Kamil