Query execution extremely slow after upgrading to Hibernate ORM 6.2

I have situation that I can’t explain and can’t fix.
My roject is created using Java 18, Spring Boot 3.0.5, Hibernate ORM 6.1.7, Hibernate Search 6.1.8 with Lucene 8.11.1 embedded with Hibernate Search and the database is MySQL 8.

Few days ago I tried to switch to Hibernate ORM 6.2.1.
There was few minor things to adjust and the project was able to start.
Almost everything works fine except one simple search that takes about 50 ms with Hibernate 6.1.7 and 22000 ms with Hibernate 6.2.1:

final SearchScope<T> scope  = searchSession.scope( Article.class );
final SearchQuery<T> q =
	searchSession
		.search(scope)
		.where(searchPredicate)
		.sort(sort)
		.toQuery();
log.debug( q.toString() );
final org.hibernate.search.engine.search.query.SearchResult<T> searchResult =
    q.fetch( 0, 1 );
query = +(+*:* +(textId:sr_rs__zdravo__tapir_0_33_l)^50.0), 
sort =
  <custom:"availability_numeric": org.hibernate.search.backend.lucene.types.sort.comparatorsource.impl.LuceneNumericFieldComparatorSource@67e887b1>,
  <custom:"importance_numeric": org.hibernate.search.backend.lucene.types.sort.comparatorsource.impl.LuceneNumericFieldComparatorSource@15677589>!,
   <custom:"hasPhotos": org.hibernate.search.backend.lucene.types.sort.comparatorsource.impl.LuceneNumericFieldComparatorSource@48504390>!,
  <custom:"completenessLevel": org.hibernate.search.backend.lucene.types.sort.comparatorsource.impl.LuceneNumericFieldComparatorSource@2bdc3f1>!,
  <custom:"sortableDate": org.hibernate.search.backend.lucene.types.sort.comparatorsource.impl.LuceneTextFieldComparatorSource@214b1918>!,
  <score>

After switching versions I did update project by Eclipse and full index rebuild.
Also there is always the same entity used to test and the same steps.
Tried to toggle versions at least 5 times and every time there was project update and full index rebuild.

Is there anyone having similar issues?

Ok, here’s more info.
Original post is related to Windows development computer.
Configuration:

  • Windows 11 Pro 22H2 OS build 22621.1555 Windows Feature Experience Pack 1000.22640.1000.0
  • Eclipse STS 4.18.0
  • Java version “18.0.2.1” 2022-08-18 Java HotSpot™ 64-Bit Server VM (build 18.0.2.1+1-1, mixed mode, sharing)
  • MySQL 8.0.16 for Win64 on x86_64
  • Tomcat 10.1.7

Application start is made by “Debug as Spring Boot App” from Eclipse STS.

After deploying to my Ubuntu test server, it works a little bit faster, but still not as expected.
Configuration:

  • Ubuntu 22.04.2 LTS (Jammy Jellyfish)
  • openjdk version “18.0.2-ea” 2022-07-19
  • Apache Tomcat/10.0.27
  • mysql Ver 8.0.32-0ubuntu0.22.04.2 for Linux on x86_64 ((Ubuntu))

Further testing will be done and I’ll try to put here as much info as I can get.

Hey,

I I understand correctly, you switched the Hibernate ORM version, not the Hibernate Search version? If so, the problem most likely comes from entity loading, so the Lucene query/sort are irrelevant.

EDIT: I took the liberty of updating the title of your post to clarify this is most likely about the ORM upgrade, not about Lucene.

I’d suggest you debug the SQL being executed by setting hibernate.show_sql = true and hibernate.format_sql = true. Also, take advantage of any tool MySQL provides to investigate slow queries (I’m not familiar with MySQL, I cannot help there unfortunately).

Also, make sure you followed the migration guides. I tried to sum up the major changes in this guide for Quarkus, but the “official” guides should be more complete:

In particular:

  • make sure you configured the dialect without the version in its name: org.hibernate.dialect.MySQLDialect and not e.g. org.hibernate.dialect.MySQL8Dialect. ORM 6 detects versions automatically.
  • be aware there were changes to how ID generation works in ORM 6; I don’t think MySQL is affected, but you’d better check that schema validation still passes.

If you can’t find the culprit, we’ll need a small reproducer project to investigate.

You’re welcome!

Hi,

Yes, you’re right. Hibernate ORM is the one. Thank you for your time and effort!
Later I switched table column from enum to varchar to be sure and did more testing which is not documented here yet because it was done at trace level and there were tons of logs which is not so useful as I expected.

The next testing will be done without trace, but with showing formatted SQL turned on. I’ll get back here with results.


The only thing that looks different than suggested by migration guide is ID. I was using Hibernate 6.0 for long time and @IdGeneratorType is suggested while migrating from 5 to 6 which makes me think it shouldn’t be the problem:

@GenericField(
    projectable = Projectable.NO,
    searchable = Searchable.YES,
    sortable = Sortable.NO
)
@GeneratedValue(
    strategy = GenerationType.SEQUENCE,
    generator = "sequence-generator"
)
@SequenceGenerator(
    name           = "sequence-generator",
    sequenceName   = "hibernate_sequence",
    allocationSize = 50
)
@Id
private Integer id;

Hope this one also should not be the problem:

public class CustomMySQLDialect extends MySQLDialect {

	private static final int MAJOR = 8;
	private static final int MINOR = 0;

	public CustomMySQLDialect() {
		super(DatabaseVersion.make(Integer.valueOf(MAJOR), Integer.valueOf(MINOR)));
	}

	@Override
	public String getTableTypeString() {
		return "engine=innodb default charset=utf8mb4 collate utf8mb4_unicode_ci";
	}

}

Ok, I’m shocked now. After changing log level to info but with show-sql enabled, here’s the comparison:

Article entity persist with 6.1.7:

  • 469 select statements
  • 22 insert statements
  • 4 update statements
  • 3 delete statements
  • which makes total of 498 SQL statements

Article entity persist with 6.2.1:

  • 29363 select statements
  • 44 insert statements
  • 12550 update statements
  • 6 delete statements
  • which makes total of 41963 SQL statements

NOTE: Also fetching a page of articles is incuded here because my admin interface reloads data grid used to manage articles. But this refresh is fast as it was before which can be checked if one does not persist any article but only press Ctrl+F5 to reload the whole JavaScript frontend including all data reload.

Log files are available here:
https://horvoje.net/tmp/6.1.7.zip (22 023 bytes)
https://horvoje.net/tmp/6.2.1.zip (1 430 623 bytes)

The only thing changed in the whole project is Hibernate ORM version in pom.xml.

Is there anything else I should exclude and add as a separate dependency?

Code snippet from pom.xml:

<properties>
     <java.version>18</java.version>
<!-- <hibernate-core.version>6.1.7.Final</hibernate-core.version> -->
     <hibernate-core.version>6.2.1.Final</hibernate-core.version>
     <org.hibernate.search.version>6.1.8.Final</org.hibernate.search.version>
</properties>

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-data-jpa</artifactId>
	<exclusions>
		<exclusion>
			<groupId>org.hibernate.orm</groupId>
			<artifactId>hibernate-core</artifactId>
		</exclusion>
	</exclusions>
</dependency>

<dependency>
	<groupId>org.hibernate.orm</groupId>
	<artifactId>hibernate-core</artifactId>
	<version>${hibernate-core.version}</version>
</dependency>

<dependency>
	<groupId>org.hibernate.search</groupId>
	<artifactId>hibernate-search-mapper-orm-orm6</artifactId>
	<version>${org.hibernate.search.version}</version>
	<exclusions>
		<exclusion>
			<groupId>org.hibernate.orm</groupId>
			<artifactId>hibernate-core</artifactId>
		</exclusion>
	</exclusions>
</dependency>

<dependency>
	<groupId>org.hibernate.search</groupId>
	<artifactId>hibernate-search-backend-lucene</artifactId>
	<version>${org.hibernate.search.version}</version>
</dependency>

At this moment it looks to me as @Transactional(readOnly = true) is not accepted for my services and each entity has been updated after finishing a session rather than just purging it from managed entities storage or cache or whatever Hibernate calls its managed entity storage.

The next will be to remove readOnly = true with 6.1.7 and comapre amount of SQL update statements.

EDIT: No, that’s not the case. The amount of SQL statements is equal to the case when @Transactional is set to readOnly = true for non-writing methods.

Next thing I did is moved all development environment from Windows to Ubuntu 22.04 - Java, Eclipse, Tomcat, nginx, MySQL, maven, git, copy of production database, copy of production photos…

The behaviour is the same on both Windows and Ubuntu.

If needed I can provide VirtualBox machine with Ubuntu and a project all set up in Eclipse for furhter investigation.

Hey,

I’ll move this to the ORM category as this is clearly not caused by Hibernate Search (from what I can see even updates are affected).

The stats you’re giving are very suspicious, in particular I wouldn’t expect to see delete statements when persisting an entity. So I suspect that when you say “entity persist” there’s a lot more going on, and that is what we need to reproduce.

I recommend you provide a simple reproducer based on hibernate-test-case-templates/orm/hibernate-orm-6 at main · hibernate/hibernate-test-case-templates · GitHub, with a simplified version of your model and a scenario that triggers more queries than in ORM 5. Then someone from the ORM team can investigate.

Ok, you’re right. My “persist” is a method that also handles photos attached to the Article and Categories too. Categories are in a separate table and delete is related with removal of old Catgories right before inserting new Categories.

I’ll try to see what to do with a reproducer.

P.S.
Every day I learn something new :slightly_smiling_face:
At the moment .fetch() is issued, all previous modifications in the same transaction are executed.
For example, this method mentioned (“persist”) does some deletions and inserts in related linking tables. At some point I have to search Article class to check if there is already Article with same properties and at the moment .fetch() is issued, Hibernate runs those deletions and inserts, not a moment before.

When Article entity is going to be persisted, there are few actions before, for example delete all entries from linking table with categories (Set<Category> → I don’t do it through SQL, Hibernate does it automatically). Also the same thing goes with with Tag entities attached as Set<Tag> to Article and Negativity entities: Set<Negativity>. That’s the reason of few deletes and inserts before main update statement on Article.

Here you can see extremely short log of single search:
Hibernate ORM 6.1.7: https://horvoje.net/tmp/6.1.7.log
Hibernate ORM 6.2.1: https://horvoje.net/tmp/6.2.1.log

Some CR/LFs are added in order to make groups of SQL statements.

If you open log 6.1.7, you’ll see few hibernate deletes and updates, then few inserts (linking tables) and finally select statements that load linked entities (Category, City, Provider…)

But when you open log for 6.2.1, there are update statement for each Category entity (which is not changed). I guess those categories were loaded by Hibernate because they’re linked to the Article and just because Article persist method has transaction mode readOnly set to true (because I have to persist an Article, of course) - those categories are also persisted meaning updated. But with 6.1.7 that was not the case.
In log for 6.2.1 there are also tons of select statements for linked table price_info which is also not the case with 6.1.7.

Just to confirm: Java code is the same including all annotations, entity mappings, everytning. The only thing changed is version number in pom.xml and Maven->Project Update is done after this change.

Sorry for so many comments!

Oh yes, just to mention reproducer project - to get this behaviour I would have to copy almos the whole project in reproducer project. So my alternative solution is to provide a working VirtualBox Ubuntu machine with a project set up including the Java, Eclipse, Tomcat, nginx, self signed certificate, Firefox browser, copy of production database, copy of photos from production server - I mean everyting.
And also breakpoints are set on places of interest for this specific situation.

Please share the mapping of Article#category.

I guess those categories were loaded by Hibernate because they’re linked to the Article and just because Article persist method has transaction mode readOnly set to true (because I have to persist an Article , of course) - those categories are also persisted meaning updated. But with 6.1.7 that was not the case.

I don’t know what the effect of this readOnly flag is for Hibernate. Could you please tell us what it does in terms of Hibernate? Does it set the query hint org.hibernate.readOnly to true? Can you check what EntityEntry.getStatus() is for those categories?

Hi!

@Transactional (org.springframework.transaction.annotation.Transactional) is used at service level and wraps each method in a transaction. Once transaction is started, next method which is also transactional just continues to use the same transaction. During the time with Hibernate I found out that setting readOnly = true enables Hibernate to avoid updates to entities at the very end of transaction. When transaction is not read only, Hibernate issues updates for entities that were loaded as managed entities no matter if you persisted (issued repository method save()) or not. At least that was the case with 6.1.7 and previous versions. Not with 6.2.1 there are some additional updates but I can’t be sure is it related to transactions or something else.

Now going to investigate how to get EntityEntry.getStatus().


Article.java

@Entity @Indexed 
class Article {

	// ...

	@ManyToMany(fetch = FetchType.LAZY)
	@FullTextField(
		projectable = Projectable.NO, searchable = Searchable.YES,
		valueBridge = @ValueBridgeRef(type = CategoriesBridge.class)
	)
	private Set<Category> categories;

	// ...

}

Category.java

/**
 * Only 1 level of parent / child is allowed.
 * If entity has a parent, entity is a child and child list is ignored.
 * If there is no parent, entity is a parent and its parent property
 * is ignored and only child list is used.
 */
@Entity @Indexed 
class Category {

	// ...

	@JsonManagedReference
	@ManyToOne(fetch=FetchType.LAZY)
	private Category parent;

	// child categories
	@OneToMany(mappedBy="parent", fetch=FetchType.LAZY)
	private List<Category> categories;

	// ...

}

With these mappings, the Article#categories collection should not be initialized and hence also not become managed, unless you somehow initialize this yourself. Can you please check how the collection becomes initialized?

Hi!

To make it more transparent and understandable, those are steps when user input comes to Spring controller:

  1. Receive DTO created from JSON sent from frontend.

2a) Check if ID is present in DTO and if it is, we’re having new item which means unmanaged entity is created by new Article()

2b) If we have ID, it’s just an update of existing entity and then fetch it from repository which will be managed item then

  1. Copy appropriate fields from DTO to entity (name, description etc)

  2. Now this step is #"$: generate text ID to use as URL (for example if your item is named “Chocolate cookies 400 g”, then text ID will be “chocolate-cookies-400-g”) and then use this generated text ID with Hibernate search to check is there already item with the same text ID → in that case append “-1” and check again and if found, remove “-1” and append “-2” etc.
    To check existence of the same text ID I use Hibernate SearchQuery and this query is the problem. At the very moment I run org.hibernate.search.engine.search.query.SearchQuery<Article>.fetch() all those SQL statements are issued. The query is pretty simple and it takes few ms with Hibernate 6.1.7:
    +(+*:* +(textId:hr_hr__zdravo__tapir_0_33_l)^50.0)

  3. After all those steps, the last step is to persist the main entity and this goes well without problems.


Here are data for EntityEntry.getStatus():


Method to dump status values:

void hibernateDebug(final String msg) {
    System.out.println();
    System.out.println("### " + msg + " ###");

    final org.hibernate.engine.spi.SessionImplementor session =
        this.entityManager.unwrap(
            org.hibernate.engine.spi.SessionImplementor.class
        );

    final org.hibernate.engine.spi.PersistenceContext pc =
        session.getPersistenceContext();

    final Map.Entry<Object, org.hibernate.engine.spi.EntityEntry>[] entityEntries =
        pc.reentrantSafeEntityEntries();

    for (final Map.Entry<Object, org.hibernate.engine.spi.EntityEntry> e
                 : entityEntries) {
        final Object key = e.getKey();
        if (key instanceof Article a) {
            System.out.println(
                "### ARTICLE -> STATUS = " + e.getValue().getStatus()
            );
        }
        else {
            System.out.println(
                "### OBJECT: " + key.getClass().getSimpleName() + "; " +
                "STATUS = " + e.getValue().getStatus()
            );
        }
    }
    System.out.println();
}


With Hibernate ORM 6.1.7

### Right after fetch from repository ###
### OBJECT: User; STATUS = MANAGED
### ARTICLE -> STATUS = MANAGED

### Right after persisting to repository ###
### OBJECT: User; STATUS = MANAGED
### ARTICLE -> STATUS = MANAGED
### OBJECT: Category; STATUS = MANAGED
### OBJECT: Category; STATUS = MANAGED
### OBJECT: Category; STATUS = MANAGED
### OBJECT: Category; STATUS = MANAGED
### OBJECT: Category; STATUS = MANAGED
### OBJECT: Category; STATUS = MANAGED
### OBJECT: Manufacturer; STATUS = MANAGED
### OBJECT: Negativity; STATUS = MANAGED
### OBJECT: Negativity; STATUS = MANAGED
### OBJECT: Negativity; STATUS = MANAGED
### OBJECT: Negativity; STATUS = MANAGED
### OBJECT: Photo; STATUS = MANAGED
### OBJECT: Provider; STATUS = MANAGED
### OBJECT: PriceInfo; STATUS = MANAGED
### OBJECT: Provider; STATUS = MANAGED
### OBJECT: PriceInfo; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Tag; STATUS = MANAGED
### OBJECT: Language; STATUS = MANAGED
### OBJECT: Language; STATUS = MANAGED
### OBJECT: Language; STATUS = MANAGED
### OBJECT: Category; STATUS = MANAGED
### OBJECT: City; STATUS = MANAGED
### OBJECT: Country; STATUS = MANAGED
### OBJECT: Provider; STATUS = MANAGED
### OBJECT: ProviderType; STATUS = MANAGED


With Hibernate ORM 6.2.1

### Right after fetch from repository ###
### OBJECT: User; STATUS = MANAGED
### ARTICLE -> STATUS = MANAGED

Not being able to post it here:
An error occurred: Body is limited to 32000 characters; you entered 404128.

HERE: https://horvoje.net/tmp/statuses.txt

Does it matter that it works fine with 6.1.7?
I mean the code is exactly the same, didn’t change anything except pom.xml.

When I look at all those managed items with 6.2.1, it seem the whole repository is loaded somehow recursively - 10747 items.

Ok, I “solved” my problem but still did not find a root cause.
It was about @IndexingDependency. Just for the test purposes I removed all @IndexingDependency and corresponding @KeywordField or @GenericField annotations from Article entity and I got the same behaviour as with Hibernate ORM 6.1.7 without hundreds of select statements.

Then I added @IndexingDependency again and problem appeared again.

The weird thing is that I have reindexOnUpdate = ReindexOnUpdate.NO on all @IndexingDependency annotations which made me think Hibernate Search won’t be reading and indexing anything on its own as I already have a service to reindex affected Article entities.

Typical example:

@GenericField(
    name = "availability",
    projectable = Projectable.NO,
    searchable = Searchable.YES,
    sortable = Sortable.NO
)
@GenericField(
    name = "availability_numeric",
    projectable = Projectable.NO,
    searchable = Searchable.YES,
    sortable = Sortable.YES,
    valueBridge = @ValueBridgeRef( type = AvailabilityToNumericBridge.class )
)
@IndexingDependency(derivedFrom = {
    @ObjectPath({ @PropertyValue(propertyName = "lifespanStatus") }),
    @ObjectPath({ @PropertyValue(propertyName = "manufacturer") }),
    @ObjectPath({
        @PropertyValue(propertyName = "priceInfos"),
        @PropertyValue(propertyName = "provider") 
    })
}, reindexOnUpdate = ReindexOnUpdate.NO)
@Transient
private Availability getAvailability() {
	return ArticleUtil.getAvailability(this);
}

To get more info about the issue - I created simple pass through RoutingBridge for each of my entities. The implementation has logging of entity being indexed and routes.addRoute(); to pass through unconditonally and make this bridge transparent.

Then I created new Article from scratch through admin interface meaning no other entities attached except one Category which is mandatory for my model. So far everything was fine.

Then I attached a Manufacturer and at the moment when my service layer issued fetch() method of org.hibernate.search.engine.search.query.SearchQuery, all Article entities that has the same Manufacturer ended up being indexed although Manufacturer entity wasn’t changed.

Then I switched to 6.1.7 and things started to act normal meaning no reindexing was done for additional entities except the entity being updated from user interface.

Is it possible that @IndexedEmbedded makes Manufacturer “dirty” when Article is being updated? That might be the cause why Manufacturer is being reindexed and then, of course, also all Article entities having that Manufacturer should be reindexed.

EDIT: Yes, that’s the case. Now I removed @IndexedEmbedded and all things related to it and reindexing came back to normal meaning no more additional entities indexed.


Property of type Manufacturer inside Article class:

@ManyToOne(fetch = FetchType.LAZY)
@IndexedEmbedded
@AssociationInverseSide(
    inversePath = @ObjectPath(@PropertyValue(propertyName = "articles"))
)
private Manufacturer manufacturer;

Back reference inside Manufacturer class:

@OneToMany(fetch = FetchType.LAZY)
@JoinColumn(name = "manufacturer_id")
private List<Article> articles;

Any idea how the @IndexedEmbedded annotation might play a role here @yrodiere?

We’re getting very far from the initial statement “queries are slow”, though :slight_smile: So it’s indexing in the end.

@IndexedEmbedded means an entity is embedded into another at indexing time, so among the many events sent by Hibernate ORM, we will consider more of them, and we may react to some of them by triggering reindexing, which will trigger more loading.

So it’s possible that we were reindexing less in 6.1 than in 6.2 because 6.2 issues more entity change events, or issues different entity change events. Though whether these additional events and this additional reindexing is right or wrong, I cannot say.

And of course there’s the possibility that the additional loading is not at all related to additional reindexing, but something else (some incorrect use of events that triggers lazy loading for some reason, …).

Rreally, this seems to happen only in your very specific scenario, @horvoje . We run integration tests with both ORM 6.1 and ORM 6.2 and I don’t see the kind of problems you’re experiencing. So in order to help, we’ll need a reproducer. Maybe now that you’ve identified a way to make the problem disappear, you can copy your model to an empty project and remove anything that’s not relevant?