Should I use an application-generated entity identifier or use the database native generator with Hibernate?


#1

Hi,

I am using the database native identifier generator with the following mapping in the Hibernate hbm file:

<id name="contactId" column="contact_id">
    <generator class="native">
        <param name="sequence">PimContact_seq</param>
    </generator>
</id>

I run an import job that inserts records into that table. The record counts range from several thousand up to a million. The import performance is slow, so I used the SQL Server Query Store to analyze the queries fired against the database. I can see that the select next value for the sequence is executed once per imported record, and there are some more supporting sequence-related queries that are executed just as many times.

I guess this is one of the reasons the import job is slow.

  • Are there any ways to optimize this?
  • Will using an application-generated identifier (and avoiding the database sequence generator) improve the performance of the import job?

#2

For SQL Server, the IDENTITY generator will be used by the native id generator as explained in this article.

Switch to SEQUENCE generator and use the pooled optimizer.
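
As a sketch of that change against your hbm mapping (the sequence name is taken from your post; the increment size of 50 is an assumption you should tune, and it must match the INCREMENT BY of the database sequence when using the pooled optimizer):

```xml
<id name="contactId" column="contact_id">
    <!-- enhanced-sequence is the SEQUENCE-based generator that accepts an optimizer -->
    <generator class="enhanced-sequence">
        <param name="sequence_name">PimContact_seq</param>
        <!-- pooled optimizer: one "next value" round-trip covers increment_size ids -->
        <param name="optimizer">pooled</param>
        <param name="increment_size">50</param>
    </generator>
</id>
```

With this setup, importing a million rows calls the sequence roughly a million divided by increment_size times instead of a million times.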

Check out this article about how to control the Persistence Context size, as well as the transaction length, which can help you speed up your batch processing task.


#3

Thanks Vlad… regarding my second question: will generating the identifier in the application also help? I guess that would be the most optimized alternative.


#4

That’s a bad idea because:

  • the DB sequence generator can be used concurrently by both your application threads and the DBA. If you use an app-generated sequence, the DBA will not know what the next available id is.
  • the DB sequence uses lightweight locks, so generating a new sequence value is very fast, and the pooled optimizer speeds that up even more.
  • if you use a UUID, it will require more space, hence the indexes will take more memory, and that will affect the clustered index as well, which is the typical way of organizing a table in SQL Server.

Therefore, use a DB sequence as it’s much better from a performance perspective.
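
To put rough numbers on the space argument, here is a small standalone sketch (not part of any mapping, just an illustration of key widths):

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// A sequence-generated bigint key fits in 8 bytes, while a UUID needs
// 16 bytes in binary form (or 36 characters as text), so UUID-keyed
// indexes, including the clustered index, are wider.
public class IdSizeComparison {
    public static void main(String[] args) {
        int bigintBytes = Long.BYTES; // 8 bytes for a bigint key

        UUID id = UUID.randomUUID();
        ByteBuffer buf = ByteBuffer.allocate(2 * Long.BYTES);
        buf.putLong(id.getMostSignificantBits());
        buf.putLong(id.getLeastSignificantBits());
        int uuidBytes = buf.capacity(); // 16 bytes in binary form

        int uuidChars = id.toString().length(); // 36 characters as text

        System.out.println(bigintBytes + " vs " + uuidBytes
                + " bytes (" + uuidChars + " chars as text)");
    }
}
```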


#5

Is the DB SEQUENCE generator supported by every DB?
At least, the databases supported by our product are SQL Server, Oracle, PostgreSQL, and, in the current release, MySQL support is being added.


#6

Sequence is supported by all DBs but MySQL. For MySQL, you can define the Ids via orm.xml as explained in this article.
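
As a sketch of such an override (the entity class and attribute names here are assumptions, not from your mapping), an orm.xml fragment for MySQL could fall back to IDENTITY:

```xml
<entity-mappings xmlns="http://xmlns.jcp.org/xml/ns/persistence/orm"
                 version="2.1">
    <entity class="com.example.PimContact">
        <attributes>
            <id name="contactId">
                <!-- MySQL has no sequences, so use IDENTITY there -->
                <generated-value strategy="IDENTITY"/>
            </id>
        </attributes>
    </entity>
</entity-mappings>
```

Note that IDENTITY disables JDBC batching for inserts in Hibernate, which matters for jobs like yours.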


#7

For SQL Server, the IDENTITY generator will be used by the native id generator as explained in this article.

Thanks Vlad… by the way, I am confused a bit… the generator class above is ‘native’, however there is a param named sequence with the value PimContact_seq, and that sequence is queried by SQL Server before each insert; see the following query from the SQL Server Query Store:

select next value for PimContact_seq as seq

So is this the IDENTITY generator or the SEQUENCE generator? From the above query, it looks like the SEQUENCE generator.


#8

The Dialect#getNativeIdentifierGeneratorStrategy looks as follows:

public String getNativeIdentifierGeneratorStrategy() {
	if ( getIdentityColumnSupport().supportsIdentityColumns() ) {
		return "identity";
	}
	else {
		return "sequence";
	}
}

Therefore, it’s not clear why it’s picking up sequence on SQL Server in your case, but you can debug it and find out why.


#9

Anyway, so essentially it’s the SEQUENCE generator that is being used. Hence, it must already be taking advantage of Hibernate’s JDBC batching, right?

Currently, hibernate.jdbc.batch_size is set to 30, but could it be tuned for better performance, or is 30 a standard number that all applications should use?


#10

Anyway, so essentially it’s the SEQUENCE generator that is being used. Hence, it must already be taking advantage of Hibernate’s JDBC batching, right?

It shouldn’t. That’s why you need to debug it and see exactly which identifier generator it picks, or just log the queries and you will understand.

Currently, hibernate.jdbc.batch_size is set to 30, but could it be tuned for better performance, or is 30 a standard number that all applications should use?

There’s no standard number. You need to measure your application to find the right size for your use case.
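
As a starting point for such measurements, these are the batching-related Hibernate settings to vary (the value 50 below is only an example, not a recommendation):

```properties
# How many statements Hibernate groups into one JDBC batch
hibernate.jdbc.batch_size=50
# Group inserts/updates by entity type so batches are not broken up
hibernate.order_inserts=true
hibernate.order_updates=true
```

Run the import with several batch_size values against a production-like data set and compare wall-clock times; the sweet spot depends on row width, driver, and network latency.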