Does Hibernate provide a way to invalidate the L2 cache?

Hi,

I am using Hibernate L2 cache in my project and using EHCache as the caching provider. Hibernate version is 4.3.11 and EHCache 2.10.5.
The question I have is:
Hibernate will take care of updating the L2 cache when writes (inserts and updates) are done through Hibernate itself. Please correct me if I am wrong.

However, when the database is updated by some other application, bypassing the Hibernate instance in my application, Hibernate has no way to know that the data has changed in the backend. In this case, could the other application that updated the backend send a notification to my application, which would then use a Hibernate API to invalidate the cache?

Or is there any other way to get around this?

If the DB is changed outside of Hibernate, you can use a CDC (Change Data Capture) approach using a tool like the open-source Debezium project to extract changes and propagate them to Hibernate.

Or, you can run some introspection queries periodically to verify the modification timestamps of the underlying cached entities.

Or, you can just set the cached entries' TTL to a lower value so that cached results are short-lived and get refreshed frequently by re-fetching from the DB.
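The second option above (periodic timestamp checks) could be sketched roughly as follows. This is only an illustration that runs on the plain JDK: the class name, the changedIdsSince function (standing in for a query such as "select the ids modified after a given time") and the evictor callback (standing in for a cache eviction call) are all hypothetical, and it assumes the table has a reliable last-modified column.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

/** Sketch: periodically look for rows modified since the last check and evict them. */
class TimestampPollerSketch {
    // Hypothetical stand-in for a query returning ids modified after the given instant.
    private final Function<Instant, List<Long>> changedIdsSince;
    // Hypothetical stand-in for a second-level cache eviction call.
    private final Consumer<Long> evictor;
    private Instant lastCheck;

    TimestampPollerSketch(Instant start,
                          Function<Instant, List<Long>> changedIdsSince,
                          Consumer<Long> evictor) {
        this.lastCheck = start;
        this.changedIdsSince = changedIdsSince;
        this.evictor = evictor;
    }

    /** Evicts every id modified since the previous poll; 'now' should be taken before querying. */
    int poll(Instant now) {
        List<Long> changed = changedIdsSince.apply(lastCheck);
        for (Long id : changed) {
            evictor.accept(id);
        }
        lastCheck = now;
        return changed.size();
    }

    /** Small demo against a fake in-memory "table"; returns the ids that were evicted. */
    static List<Long> demo() {
        Map<Long, Instant> modifiedAt = new LinkedHashMap<>();
        modifiedAt.put(1L, Instant.ofEpochSecond(10)); // modified before the last check
        modifiedAt.put(2L, Instant.ofEpochSecond(50)); // modified after the last check
        Function<Instant, List<Long>> query = since -> {
            List<Long> ids = new ArrayList<>();
            for (Map.Entry<Long, Instant> row : modifiedAt.entrySet()) {
                if (row.getValue().isAfter(since)) {
                    ids.add(row.getKey());
                }
            }
            return ids;
        };
        List<Long> evicted = new ArrayList<>();
        TimestampPollerSketch poller =
                new TimestampPollerSketch(Instant.ofEpochSecond(30), query, evicted::add);
        poller.poll(Instant.ofEpochSecond(60));
        return evicted;
    }
}
```

In a real application the poll method would be driven by a scheduler, and stale data can still be served between two polls, so this only narrows the window rather than closing it.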

Thanks… yes, currently we are using the 3rd approach, with a TTL of 1 hour.

The 2nd approach will require modification to the application.

As for the 1st approach, I understand how Debezium would extract database changes, but can you please elaborate on the latter part, "propagate them to Hibernate"? How will this happen, given that Hibernate is running inside my Java application process and Debezium would, I believe, run in a different process? How will they communicate?

They will communicate via Kafka. Your application will have to read the Kafka log and make sure it updates the cache accordingly.

Check out this article for more details about using Debezium, Kafka and Hibernate.

If you are making changes through Hibernate entities, you don't have to do anything else to ensure the consistency of the L2 cache; Hibernate will take care of it.

If you are making changes via native SQL statements, explicitly declare which entities are affected (for example with SQLQuery#addSynchronizedEntityClass); otherwise Hibernate cannot tell what was touched and will invalidate the entire second-level cache.

If you are changing data in the database from another process, then Hibernate is not aware of it, and you will have to define a strategy that best suits your application's requirements (expiration policies, explicit invalidation called from outside the application, etc.).

To elaborate on the suggested Debezium alternative, I think 2nd-level cache invalidation is a great use case for leveraging Debezium's embedded engine instead of running it via Kafka. In this mode of operation, Debezium runs within your application itself and a callback method is invoked whenever a change event arrives. This handler would then invoke the Cache#evict() method for the given entity id.
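A minimal sketch of such a handler, written against the plain JDK so that it runs on its own. The EntityCache interface is only a stand-in with the same shape as javax.persistence.Cache#evict(Class, Object), and ChangeEvent is a hypothetical, simplified event (table name plus primary key). A real Debezium change event carries more structure, and the key would first have to be converted to the entity's id type.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: route change events from an external source to second-level cache evictions. */
class CacheInvalidationSketch {

    /** Stand-in with the same shape as javax.persistence.Cache#evict(Class, Object). */
    interface EntityCache {
        void evict(Class<?> entityClass, Object primaryKey);
    }

    /** Hypothetical, simplified change event: table name plus primary key value. */
    static final class ChangeEvent {
        final String table;
        final Object primaryKey;

        ChangeEvent(String table, Object primaryKey) {
            this.table = table;
            this.primaryKey = primaryKey;
        }
    }

    /** Maps table names to entity classes and evicts the matching cache entry. */
    static final class Invalidator {
        private final Map<String, Class<?>> entityByTable = new HashMap<>();
        private final EntityCache cache;

        Invalidator(EntityCache cache) {
            this.cache = cache;
        }

        void register(String table, Class<?> entityClass) {
            entityByTable.put(table, entityClass);
        }

        /** Returns true if the event belonged to a registered (cached) entity. */
        boolean onChange(ChangeEvent event) {
            Class<?> entityClass = entityByTable.get(event.table);
            if (entityClass == null) {
                return false; // table is not mapped to a cached entity; ignore
            }
            cache.evict(entityClass, event.primaryKey);
            return true;
        }
    }

    /** Placeholder entity class for the demo. */
    static class Employee {}

    /** Runs a small demo and returns a description of each eviction that happened. */
    static List<String> demo() {
        List<String> evicted = new ArrayList<>();
        EntityCache recording = (cls, id) -> evicted.add(cls.getSimpleName() + "#" + id);
        Invalidator invalidator = new Invalidator(recording);
        invalidator.register("employee", Employee.class);
        invalidator.onChange(new ChangeEvent("employee", 42L));  // evicts Employee#42
        invalidator.onChange(new ChangeEvent("audit_log", 7L));  // unmapped table, ignored
        return evicted;
    }
}
```

In an application, the EntityCache would presumably be backed by the cache obtained from EntityManagerFactory#getCache(), and onChange would be called from the embedded engine's callback.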

I've filed DBZ-991 in our tracker for creating a blog post on this, as it poses some interesting challenges, e.g. how to materialize the right id type from the change message. Not sure when we'll get to this, but I hope we'll find some time for writing this up soonish.

Cool feature. I should definitely give it a try too.

Nice feature. That would be much more user-friendly than integrating through Kafka. I would appreciate it if you could write a blog post about it.

Here's the promised blog post which shows how to invalidate 2LC items via Debezium and change data capture:

https://debezium.io/blog/2018/12/05/automating-cache-invalidation-with-change-data-capture/

There's also a complete demo project in our examples repo, which is linked from the post. Feedback very welcome!

Vlad,

I was wondering if the following could be a feature provided by Hibernate:
For the Hibernate query cache, Hibernate could integrate with an in-process Debezium and, based on the events received, invalidate the query cache internally.
I think this would be a very useful feature, and it would extend to architectures in which the database gets updated by other applications.

That could only be done via a separate Hibernate module like hibernate-debezium.

Using the embedded mode, it would be easier to implement it, but I'm not sure if Zookeeper is still needed to operate as well. Maybe @gunnar.morling can shed some light on this idea.

For the Hibernate query cache, Hibernate could integrate with an in-process Debezium and, based on the events received, invalidate the query cache internally.

Yes, the same approach discussed in the blog post could be used for not only invalidating items in the 2nd-level cache but for invalidating query cache regions, too.

I'm not sure if Zookeeper is still needed to operate as well

No, ZK is not needed when running Debezium in library mode. All you need is the Debezium core JAR and the Debezium connector for your database.

Essentially, it's "only" a matter of finding someone who is willing to spend the time to work on this; any contributions will be very much welcomed. I've added a comment to DBZ-911 which describes the steps needed to create a generic implementation of this approach.

I think invalidation of L2 cache will be much easier than the query cache since there is more to the invalidation of query cache.

I don't think it's much more difficult. Essentially for the query results cache we just need to obtain the affected entity type and trigger invalidation via the TimestampsCache. So in fact it's even simpler than invalidating specific 2LC entries.
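To illustrate why this is simple, here is a small model of timestamp-based invalidation written against the plain JDK. It is a simplified sketch of the idea, not Hibernate's actual code: the class and method names are hypothetical, and it assumes each cached query result remembers when it was cached and which tables ("spaces") it read from.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Simplified model: a cached query result is stale if any table it read was updated since. */
class TimestampInvalidationSketch {
    // Last known update time per table (query space).
    private final Map<String, Long> lastUpdateBySpace = new HashMap<>();

    /** Records that the given table changed at time 'now', e.g. on an external change event. */
    void invalidate(String space, long now) {
        lastUpdateBySpace.put(space, now);
    }

    /** A result cached at 'cachedAt' is usable only if none of its tables changed afterwards. */
    boolean isUpToDate(Set<String> spaces, long cachedAt) {
        for (String space : spaces) {
            Long lastUpdate = lastUpdateBySpace.get(space);
            if (lastUpdate != null && lastUpdate >= cachedAt) {
                return false;
            }
        }
        return true;
    }
}
```

With this scheme a change event only needs to name the affected table or entity type; no individual cached query results have to be located and removed.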

Would you perhaps be interested in giving it a try? The sample code from the blog post is here. It could be the starting point for a more generic, ready-made implementation.

Thanks… however, due to some other commitments I won't be able to spend enough time on this. Also, I am part of the Hibernate user community and not a developer. I would recommend that someone with good knowledge of the Hibernate code take this up.

Unfortunately, there are hundreds of issues to be fixed, new features to be implemented, documentation to be updated, and answers to be given on the forum, so the Hibernate team is busy with these tasks most of the time.

This is an open-source project, so the community should also be interested in making it better, right?

Could I evict the second-level cache when the database is updated in another process? For example, when I have a relationship:
Employee(1) - (n)DepartmentMemberships(n) - (1)Department

When I remove the department, the DepartmentMemberships will be removed as well (cascade remove), and I then want the second-level cache entries for Employee to be invalidated (because the DepartmentMemberships are fetched with Employee).

How do I delete cache files from temporary files?
And does it affect the computer's speed?