Concurrency issues since upgrade to 5.4.24

Okay, seems like that we have found the root cause for this issue…

The SQLExceptions in our stack trace are caused by incorrect batchsizes for EntityLoader#Builder. This results in incorrect number of parameters being generated into prepared SQL statements for loading entities. Sometimes we face too many parameter placeholders like mentioned above, sometimes too few.

<tldr>EntityLoader$Builder is not thread safe and should not be used concurrently</tldr>

The identical DynamicBatchingEntityLoader (i.e. the same object ID) is used in several threads at the same time.This object contains an EnityLoader$Builder (obviously same object ID as well) which is used to build the EntityLoader. The batch size is set on the builder according to the number of ids that has to be loaded in DynamicBatchingEntityLoader#.load() before the factory method EntityLoader.Builder.byPrimaryKey () is called. So there is a window of vulnerability between setting the batch size and generating the SQL string. If on a different thread the batch size is set to a different value we are in trouble.

It is still not completely clear to us why exactly the identical DymamicBatchingEntityLoader can be used on several threads simultaneously. The solution to the question can probably be found in AbstractEntityPersister.doLoad(). The method getAppropriateLoader() is called there. In our case always the last branch is taken (getLoaderByLockMode()). This does not respect the current session.

The changeset mentioned in our initial message is most likely not responsible for the problems observed. I would rather expect HHH-14271 Lazy initialization of UniqueEntityLoader for most LockMode… · hibernate/hibernate-orm@81d526e · GitHub to be the source. Some caching and lazy initialization for EnitityLoaderBuilders seems to have been introduced there.

The changes for [HHH-14312]([HHH-14312] Padded batch style entity loader ignores entity graph - Hibernate JIRA we initially suspected just made the pre-existing bug much more likely to occurr in our situation. While fixing HHH-14312 a special treatment for batchsize = 1 was eliminated. This leads to a larger number of calls to withBatchSize() and therefore a higher potential for collisions.

Does this sound reasonable?
If no: Could we just be holding it wrong and by this cause getAppropriateLoader() return inappropriate stuff? Would you need more information to understand better what is going wrong?
If yes: How should we proceed? Will you file a bug or should we do that? How likely is a fix for this issue for the next release?