About the order of SQL execution when flushing to the database


#1

Hello,

the documentation for method performExecutions(EventSource session) in class org.hibernate.event.internal.AbstractFlushingEventListener says

Execute all SQL (and second-level cache updates) in a special order so that foreign-key constraints cannot be violated:

Inserts, in the order they were performed
Updates
Deletion of collection elements
Insertion of collection elements
Deletes, in the order they were performed 

Can you please clarify why this is done like this?

It seems that if the persistence context were flushed in the order in which changes to entities happened in application code, the foreign-key constraints would not be violated either (assuming the entity changes in the code are ordered to avoid constraint violations).

Related to this, take Example 370 in the User Guide, section 6.5, “Flush operation order”, repeated here:

Person person = entityManager.find( Person.class, 1L);
entityManager.remove(person);

Person newPerson = new Person( );
newPerson.setId( 2L );
newPerson.setName( "John Doe" );
entityManager.persist( newPerson );

When flushed, this generates the SQL:

INSERT INTO Person (name, id)
VALUES ('John Doe', 2L)

DELETE FROM Person WHERE id = 1

It seems that this would also work if the deletion were done before the insert, following the order of the changes in the code.
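
The ordering described in the quoted Javadoc can be pictured with a small, self-contained model. This is only a sketch: `FlushOrderSketch` and its `ActionType` enum are hypothetical names, not Hibernate's actual `ActionQueue`, and the model covers just the five categories listed above. It shows that the statement order at flush time depends on the action type, not on the order of the calls.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Toy model of a flush-time action queue; NOT Hibernate's real implementation. */
final class FlushOrderSketch {

    /** Declaration order mirrors the order listed in the quoted Javadoc. */
    enum ActionType { INSERT, UPDATE, COLLECTION_REMOVE, COLLECTION_INSERT, DELETE }

    private final Map<ActionType, List<String>> queues = new EnumMap<>(ActionType.class);

    /** Records an action; within one type, call order is preserved. */
    void enqueue(ActionType type, String sql) {
        queues.computeIfAbsent(type, t -> new ArrayList<>()).add(sql);
    }

    /** Returns the queued statements grouped by type, in enum declaration order. */
    List<String> flush() {
        List<String> executed = new ArrayList<>();
        for (ActionType type : ActionType.values()) {
            List<String> queued = queues.get(type);
            if (queued != null) {
                executed.addAll(queued);
            }
        }
        queues.clear();
        return executed;
    }

    public static void main(String[] args) {
        FlushOrderSketch queue = new FlushOrderSketch();
        // Same call order as the example: remove first, persist second.
        queue.enqueue(ActionType.DELETE, "DELETE FROM Person WHERE id = 1");
        queue.enqueue(ActionType.INSERT, "INSERT INTO Person (name, id) VALUES ('John Doe', 2)");
        // Prints the INSERT first, then the DELETE.
        queue.flush().forEach(System.out::println);
    }
}
```

In this model the DELETE is enqueued first but still executes last, which matches the SQL order shown in the example.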

Thanks
Mário


#3

Can you please clarify why this is done like this?

I asked Gavin about the order and he said that the order was chosen to minimize the chances of constraint violations.

However, we can run an experiment and see what happens when we change the order and move the DELETE action queue before the INSERT. You can do it here. Afterward, just run the tests in hibernate-orm and see what happens.


#4

Has the Hibernate team ever tried removing the above order enforcement and simply flushing in the order of the entity changes in the application code, to see what would happen to the chance of constraint violations?

All the examples I have seen on this topic (not many) suggest that code order would work. For instance, in your article “A beginner’s guide to Hibernate flush operation order”, if the image were removed before the other one was added, as in the code, there would be no constraint violation.


#5

If I do the DELETE action before the INSERT and run all tests, this is what I get:

org.hibernate.userguide.associations.ManyToManyUnidirectionalTest > testRemove FAILED
    javax.persistence.RollbackException at ManyToManyUnidirectionalTest.java:83
        Caused by: javax.persistence.PersistenceException at ManyToManyUnidirectionalTest.java:83
            Caused by: org.hibernate.exception.ConstraintViolationException at ManyToManyUnidirectionalTest.java:83
                Caused by: org.h2.jdbc.JdbcSQLException at ManyToManyUnidirectionalTest.java:83

org.hibernate.userguide.associations.OneToManyUnidirectionalTest > testLifecycle FAILED
    javax.persistence.RollbackException at OneToManyUnidirectionalTest.java:39
        Caused by: javax.persistence.PersistenceException at OneToManyUnidirectionalTest.java:39
            Caused by: org.hibernate.exception.ConstraintViolationException at OneToManyUnidirectionalTest.java:39
                Caused by: org.h2.jdbc.JdbcSQLException at OneToManyUnidirectionalTest.java:39

org.hibernate.userguide.pc.PersistenceContextTest > test FAILED
    javax.persistence.OptimisticLockException at PersistenceContextTest.java:113
        Caused by: org.hibernate.StaleStateException at PersistenceContextTest.java:113

However, the real problem is not that Hibernate does not issue the DELETE prior to the INSERT. In reality, the problem is that the user does not issue the UPDATE instead. If you stick to my advice in that article, you should never bump into this problem.


#6

Vlad, if you do what? What did you change before running those tests? Did you simply swap the order of the DELETEs and INSERTs in the action queue here?

If yes, that is not what I suggested. My suggestion was to let the flush follow the order of the changes in the code, which means enforcing no action queue order at all. That is different from imposing another order by rearranging the entries of EXECUTABLE_LISTS_MAP in the above link.

From the code example in your blog entry, it seems obvious that if one image is first removed from the collection and another is then added, like this

product.removeImage(sideImage);
product.addImage(backImage);

the corresponding SQL statements would never generate a constraint violation at database level.

Could you run the same tests while following the order in which the changes happen in the test code?

Update: note that in your blog, when you give the following solution

product.removeImage(sideImage);
entityManager.flush();
product.addImage(backImage);

you are in fact bypassing the internally enforced order: calling flush() restores the order of the changes in the code.

Does Hibernate enforce its order because of some kind of optimization I am not aware of?


#7

That cannot be done because Hibernate supports cascading, secondary tables, joined inheritance. Not to mention statement ordering for batching.

It’s an inherent way of using an ORM framework.

Now, what exactly is the real use case you think Hibernate does not handle properly with its action ordering?

I have now realized that my blog post did not suggest using an UPDATE instead of flushing the persistence context, which is just a hack. I did that in my book but forgot to update the blog post.


#8

What cannot be done? Make Hibernate flush as per the order of changes in the code?

Now, what exactly is the real use case you think Hibernate does not handle properly with its action ordering?

I am not thinking of a specific use case. The cases that Hibernate’s default flush order seems unable to handle properly are all those where a manual flush, or a code rearrangement to use an UPDATE, is needed, as in your blog’s example.

I am just trying to understand why Hibernate does it the way it does, and it seems that it is because it “supports cascading, secondary tables, joined inheritance” (and perhaps others).

It is not clear to me how these hinder “code order flushing” in an ORM, which is a very strong claim, but I accept that you know something I do not. If you could give evidence supporting this claim, perhaps with an example involving cascading, it would be a great aid to my understanding of this subject.


#9

Yes. Hibernate cannot be changed as you suggested. But that is not a bug, it’s a feature.

What you describe is how an ActiveRecord framework should work, not an ORM framework.


#10

What you describe is how an ActiveRecord framework should work, not an ORM framework.

It might be, but I cannot think of any evidence that “code order flushing” in an ORM is not possible. For example, calling entityManager.flush() after every change would do it, or not?

An example would clarify my understanding.


#11

If you flush after every operation, there’s no reason to use an ORM. In fact, a manual flush is a code smell when using Hibernate.

The reason it works like that is because, otherwise, Hibernate would not have a chance to batch statements since the user could interleave persist and remove operations. Also, the order exists because of how the parent side of a unidirectional association works: it does the INSERTs first and then the UPDATE, since the child side does not know there’s an FK on its side. Probably there are other use cases as well.
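
The unidirectional case mentioned above can be illustrated with a toy in-memory model. This is a sketch under assumptions: the class `UnidirectionalOneToManySketch`, the parent/child tables and the ids are all hypothetical, and the model only mimics a foreign-key check on a child row's `parent_id`. It compares "INSERTs first, then the UPDATE that sets the FK" with a naive strategy that issues the FK UPDATE at the moment the child is added to the parent's collection, before either row exists.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy database with a parent table and a child table holding a nullable FK. */
final class UnidirectionalOneToManySketch {

    private final Set<Long> parentIds = new HashSet<>();
    private final Map<Long, Long> childParentId = new HashMap<>(); // child id -> parent_id

    void insertParent(long id) {
        parentIds.add(id);
    }

    /** The child INSERT carries no FK value: the child entity does not map that column. */
    void insertChild(long id) {
        childParentId.put(id, null);
    }

    /** UPDATE child SET parent_id = ? WHERE id = ?, with a simulated FK check. */
    void updateChildParent(long childId, long parentId) {
        if (!childParentId.containsKey(childId)) {
            throw new IllegalStateException("no child row " + childId);
        }
        if (!parentIds.contains(parentId)) {
            throw new IllegalStateException("FK violation: no parent row " + parentId);
        }
        childParentId.put(childId, parentId);
    }

    private static boolean runsCleanly(Runnable statements) {
        try {
            statements.run();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** INSERTs first, then the UPDATE that links the child to the parent. */
    static boolean flushOrderSucceeds() {
        UnidirectionalOneToManySketch db = new UnidirectionalOneToManySketch();
        return runsCleanly(() -> {
            db.insertParent(1L);
            db.insertChild(10L);
            db.updateChildParent(10L, 1L);
        });
    }

    /** Naive call order: the collection change is written before either INSERT. */
    static boolean naiveCallOrderSucceeds() {
        UnidirectionalOneToManySketch db = new UnidirectionalOneToManySketch();
        return runsCleanly(() -> {
            db.updateChildParent(10L, 1L);
            db.insertParent(1L);
            db.insertChild(10L);
        });
    }
}
```

Here `flushOrderSucceeds()` returns true and `naiveCallOrderSucceeds()` returns false. Whether a real call-order strategy would have to behave this naively is exactly the open question in this thread; the model only shows why the FK-setting statement has to come after the INSERTs.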


#12

Ok, now we are getting there …

If you flush after every operation, there’s no reason to use an ORM. In fact, a manual flush is a code smell when using Hibernate.

It is definitely a code smell if it is abused or unnecessary; otherwise, it is needed to impose “code change order” during flushing, when Hibernate’s default behavior does not work for a specific use case, like your product and image example above.

The reason it works like that is because, otherwise, Hibernate would not have a chance to batch statements since the user could interleave persist and remove operations

This is why I wondered earlier whether optimizations were the reason behind Hibernate’s enforced flush order. Indeed, this order is necessary to group similar operations and support JDBC batching for performance; otherwise batching would not be effective, as we can read in your blog entry “How to batch INSERT and UPDATE statements with Hibernate”:

A JDBC batch can target one table only, so every new DML statement targeting a different table ends the current batch and initiates a new one. Mixing different table statements is therefore undesirable when using SQL batch processing.
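
The effect described in this quote can be made concrete with a small counting sketch. It assumes a simplified rule taken from the quote: a batch continues only while consecutive statements share the same statement template, and any change starts a new batch. The class `BatchRunSketch` and the statement strings are hypothetical, and batch-size limits are ignored.

```java
import java.util.Arrays;
import java.util.List;

/** Counts how many batches a statement sequence needs under the simplified rule. */
final class BatchRunSketch {

    /** Each run of identical consecutive templates forms one batch. */
    static int countBatches(List<String> statementTemplates) {
        int batches = 0;
        String previous = null;
        for (String template : statementTemplates) {
            if (!template.equals(previous)) {
                batches++;
                previous = template;
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        String insert = "INSERT INTO post (title, id) VALUES (?, ?)";
        String delete = "DELETE FROM post WHERE id = ?";

        // Statements executed in interleaved call order.
        List<String> interleaved = Arrays.asList(insert, delete, insert, delete);
        // The same statements grouped by type, as a fixed flush order would do.
        List<String> grouped = Arrays.asList(insert, insert, delete, delete);

        System.out.println(countBatches(interleaved)); // 4
        System.out.println(countBatches(grouped));     // 2
    }
}
```

With interleaved persist and remove calls, every statement ends up in its own batch, whereas grouping by type needs only two.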

And the next:

Also, the order exists because of how the parent side of a unidirectional association works: it does the INSERTs first and then the UPDATE, since the child side does not know there’s an FK on its side. Probably there are other use cases as well.

This is perhaps due to the way Hibernate maps objects/changes to the database. The use case in the sentence is not clear to me, so I do not know if there is another way to map changes to the database that would relax the need for enforcing a flush order different from the code order.


#13

It is definitely a code smell if it is abused or unnecessary; otherwise, it is needed to impose “code change order” during flushing, when Hibernate’s default behavior does not work for a specific use case, like your product and image example above.

Nope. It’s a code smell. I updated the entire article to make it clear.

This is perhaps due to the way Hibernate maps objects/changes to the database. The use case in the sentence is not clear to me, so I do not know if there is another way to map changes to the database that would relax the need for enforcing a flush order different from the code order.

If you are really interested in this topic, you could try to investigate whether your suggestion can be done while still preserving batch statement ordering, cascading, unidirectional association behavior, etc. That’s the beauty of open-source software development.


#14

That was a substantial update in your article.

You suggest replacing a sequence that deletes one instance of an object and adds a new one (a DELETE and an INSERT), like this

entityManager.remove(post);
...
entityManager.flush();
entityManager.persist(newPost);

with an UPDATE to the existing instance like this

post.setTitle(...)

This solves the constraint violation issue through a workaround that sidesteps Hibernate’s flush order. That is, Hibernate’s flush order did not avoid the constraint violation; your code refactoring did. This means that Hibernate’s flush order could not solve this case of constraint violation. Perhaps it can only minimize the chance of such violations, for instance those related to cascading, secondary tables, etc., as suggested above (though I have yet to see evidence of this).

(Note how the refactored code degenerates into flushing in code change order.)

This example reinforces my view that Hibernate’s flush order is mostly there (only?) because of performance optimizations related to the way batch processing is done in JDBC. Please do not see this as a criticism of anything, as Hibernate’s flush order is perhaps the only way to leverage JDBC’s batch processing feature.

Nope. It’s a code smell. …

In your example, I would agree that it is a code smell. In fact, I think the developer did not really want to delete a post and insert another, which was how she coded initially. What she really wanted, was to change the title of an existing post, but she did not know it …

Now, suppose a situation where deleting and inserting a post really is the way to go. Suppose that posts have many fields, including the slug, and the developer really wants a new post with every field different except the slug. Imagine that the new post is read from a file, such as a JSON feed, so that a single line of Java code brings it into existence in memory. In this case, removing the existing post, doing a manual flush, and persisting the new post would be much easier and more elegant than typing a long and cumbersome sequence of existingPost.setX(...) calls for every field X of the post except the slug. Alternatively, suppose that the existing and the new post have different associations to other objects, in which case simply updating the existing post to become the new one is not possible… This example shows that your suggestion is either cumbersome or impossible in cases more complex than the one you gave.

So my conclusion is that manually calling flush() to force code change ordering is not a code smell in real delete-and-insert cases: it is needed there as a workaround to restore code change order when Hibernate’s default flush order cannot avoid a constraint violation. What really is a code smell is doing a delete and insert when what one wants is an update.
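
The delete-and-insert case argued about here can be pictured with a toy in-memory table. This is a sketch under assumptions: it supposes that the slug column carries a unique constraint (which the blog example appears to imply), and `UniqueSlugSketch`, the ids and the slug value are hypothetical. It compares the default flush order (INSERT before DELETE) with the order obtained by calling flush() between remove() and persist().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Toy post table with a simulated unique constraint on the slug column. */
final class UniqueSlugSketch {

    private final Map<String, Long> idBySlug = new HashMap<>();

    void insert(long id, String slug) {
        if (idBySlug.containsKey(slug)) {
            throw new IllegalStateException("unique constraint violated on slug: " + slug);
        }
        idBySlug.put(slug, id);
    }

    void delete(long id) {
        idBySlug.values().removeIf(existing -> existing == id);
    }

    /** Starts from one existing post and reports whether the statements violate the constraint. */
    static boolean violates(Consumer<UniqueSlugSketch> statements) {
        UniqueSlugSketch table = new UniqueSlugSketch();
        table.insert(1L, "my-post");
        try {
            statements.accept(table);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Default flush order: the INSERT for the new post runs before the DELETE.
        System.out.println(violates(t -> { t.insert(2L, "my-post"); t.delete(1L); })); // true
        // remove(), flush(), persist(): the DELETE reaches the database first.
        System.out.println(violates(t -> { t.delete(1L); t.insert(2L, "my-post"); })); // false
    }
}
```

The model says nothing about which approach is better design; it only reproduces the constraint violation that both sides of the discussion agree occurs with the default order.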

If you are really interested in this topic, you could try to investigate whether your suggestion can be done while still preserving batch statement ordering …

I don’t think it can be done because of batch statement ordering. And I do not know if this is done as it is in JDBC because of a strict JDBC design decision, or because of the way most databases support it.

A few other comments to your blog article:

  • there is a typo in “Doman Model”;

  • add “, but a different title:” to the end of “… instead with the same slug attribute”.


#15

There’s nothing elegant about deleting a record to insert it back.

Just as clearing a collection to re-add the items coming in a web request is bad design that requires a manual flush, this delete-and-insert is also a problem.

That example with having to call a bunch of setters is the right way to implement that use case and it’s not cumbersome at all. In fact, the unit test will enforce the logic so nothing to worry about.


#16

That example with having to call a bunch of setters is the right way to implement that use case and it’s not cumbersome at all. In fact, the unit test will enforce the logic so nothing to worry about.

You simply did not read carefully what I wrote: doing as you say won’t work well, or is even impossible, if the existing and new posts have different links to other objects. In this case, a delete and insert is more recommended than updating the fields and the associations of the existing post.

In addition, there are other use cases where removing links and resetting fields triggers events to observers, so your suggestion to always do an update may not work well, or at all, when a real delete and insert is needed.

Finally, I still have not seen any evidence of the need for Hibernate’s flush order but to comply with the requirements of JDBC batch updates. On this point, I do not have time to take up your challenge to devise an alternative that would avoid imposing a code style on developers just to work around Hibernate’s flush order issues. But I would try this idea:

  • throw away Hibernate’s flush order in favor of code changes order;
  • then, depending on the order changes are done in the code, issue a JDBC batch update or not.

Then, with this, developers would only have to know that their code could run faster if they aggregate similar changes in their transaction’s change sequence. It’s an idea.

Regarding cumbersomeness and elegance, it is in the eye of the beholder :)

I also suggest that you enrich your blog post with the “cumbersome” example I gave and explanation, and let your readers decide their choices when coding.

Thanks for this discussion!


#17

You simply did not read carefully what I wrote: doing as you say won’t work well, or is even impossible, if the existing and new posts have different links to other objects. In this case, a delete and insert is more recommended than updating the fields and the associations of the existing post.

I have yet to find a real-life example where the manual flush is the right way to do it.

In addition, there are other use cases where removing links and resetting fields triggers events to observers, so your suggestion to always do an update may not work well, or at all, when a real delete and insert is needed.

You should add an example to illustrate what you mean. Just add some entities and the data access code which proves what you say. I’ll then come up with an alternative that’s more efficient and does not require the manual flush.

Finally, I still have not seen any evidence of the need for Hibernate’s flush order but to comply with the requirements of JDBC batch updates.

It’s actually more than that. It’s how all ORMs work because, unlike ActiveRecord, an ORM executes statements on your behalf based on entity state changes.

Then, with this, developers would only have to know that their code could run faster if they aggregate similar changes in their transaction’s change sequence. It’s an idea.

I have no idea what you’re talking about.

I also suggest that you enrich your blog post with the “cumbersome” example I gave and explanation, and let your readers decide their choices when coding.

There’s no need. It was you who said that doing the update is cumbersome. It is you who needs to write a blog post to prove that manual flush is needed for the use case you described.

Thanks for this discussion!

You’re welcome.


#18

Well, Hibernate’s documentation on manual flushing makes this statement (although I do not know what is meant by “multi-request logical transactions”):

This mode is useful when using multi-request logical transactions, and only the last request should flush the persistence context.

Note that I am not complaining about Hibernate’s flush order, as I understand it is a way to support batch processing in JDBC. I just don’t agree with you that manually calling flush() is always a code smell.

You should add an example to illustrate what you mean. …

I think I did this already: in your recommendation, developers have to remove all links on the existing object, then reset its fields and links to match the desired state for the new object. Pretty elegant and nice to type code indeed … What else do you need as an example?

Finally, I am still waiting for evidence that Hibernate’s flush order is needed for anything else than for performance reasons via JDBC batch updates …

Cheers :)


#19

This mode is useful when using multi-request logical transactions, and only the last request should flush the persistence context.

That’s related to FlushMode.MANUAL, not to a session.flush() added as a hack to a delete-then-insert anti-pattern. FlushMode.MANUAL guarantees that no AUTO flush is executed. It does not mandate that you have to call flush. The last read-write transaction can use FlushMode.AUTO. So, even in multi-request logical transactions you don’t need to call flush manually.

I think I did this already: in your recommendation, developers have to remove all links on the existing object, then reset its fields and links to match the desired state for the new object. Pretty elegant and nice to type code indeed … What else do you need as an example?

A Pull Request is worth 1000 words. Modify this example I used in my blog post to prove your point, and send a Pull Request with your changes.

Finally, I am still waiting for evidence that Hibernate’s flush order is needed for anything else than for performance reasons via JDBC batch updates …

Well, I already did that:

  1. Also, the order exists because of how the parent side of a unidirectional association works: it does the INSERTs first and then the UPDATE, since the child side does not know there’s an FK on its side. Probably there are other use cases as well.
  2. It’s actually more than that. It’s how all ORMs work because, unlike ActiveRecord, an ORM executes statements on your behalf based on entity state changes.
  3. Hibernate does not record the user operation order. The changes generate events, which in turn generate operation actions. A unidirectional collection recreate will generate a join table DELETE where the FK matches the parent PK, followed by many INSERTs, which, as you might have guessed, is also a code smell. Also, there’s no UPDATE equivalent in EntityManager. When would the UPDATE execute if everything were executed in the user call order?
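
Point 3 can be pictured with a minimal state-based sketch. It is an illustration under assumptions, not Hibernate's dirty-checking code: `DirtyCheckSketch`, its `Post` class and the generated SQL text are hypothetical. The entity is changed through plain setters or field writes, no persistence call marks the moment of change, and the UPDATE only comes into existence when flush compares the current state with the snapshot taken at load time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy persistence context that derives UPDATEs from state, not from calls. */
final class DirtyCheckSketch {

    /** Plain mutable entity; changing it involves no persistence API call. */
    static final class Post {
        final long id;
        String title;

        Post(long id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    /** Managed entity -> title as it was when last synchronized. */
    private final Map<Post, String> loadedTitles = new LinkedHashMap<>();

    /** Starts tracking an entity and remembers its current title as the snapshot. */
    void manage(Post post) {
        loadedTitles.put(post, post.title);
    }

    /** Compares current state with the snapshot; UPDATEs are created only here. */
    List<String> flush() {
        List<String> updates = new ArrayList<>();
        for (Map.Entry<Post, String> entry : loadedTitles.entrySet()) {
            Post post = entry.getKey();
            if (!Objects.equals(post.title, entry.getValue())) {
                updates.add("UPDATE post SET title = '" + post.title + "' WHERE id = " + post.id);
                entry.setValue(post.title); // refresh the snapshot
            }
        }
        return updates;
    }
}
```

In this model, two successive title changes before a flush still produce a single UPDATE carrying the final value, so there is no point in the call sequence to which the statement could be tied.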

#20

Just to recap: I just wanted to understand why Hibernate chose its flush order. It was said that “the order was chosen to minimize the chances of constraint violations”, but I have not seen this illustrated anywhere.

Regarding this, you give your points 1, 2, and 3, but it would be of great help if you could illustrate 1 and 3, perhaps with Java code, and the SQL that would be generated if code change order was issued instead during the flush at transaction end, showing that doing so would trigger a constraint violation, and how this SQL is ordered by Hibernate to prevent it.

If you illustrate this to me, then I believe this whole long topic can be closed, and I can leave this discussion with a clear idea of the reasoning behind Hibernate’s design decisions.

(And I still do not know what “multi-request logical transactions” really means … Any example?)


#21

Regarding this, you give your points 1, 2, and 3, but it would be of great help if you could illustrate 1 and 3, perhaps with Java code, and the SQL that would be generated if code change order was issued, thus triggering a constraint violation, and how this SQL is ordered by Hibernate to prevent it.

For 1, here’s an article which illustrates how unidirectional associations work. You’ll find both the Java code and the SQL being generated.

For 3, you’ll have to debug the Hibernate code to see what I’m talking about because it involves way too many components: Event, DefaultEventListeners, ActionQueue, FlushEventListener, etc.

Here’s an article about multi-request logical transactions.