Envers: strategy for high performance inserts

I'm happily using Envers with the ValidityAuditStrategy.

Now I need to add an additional table to the solution. This table needs an audit shadow table too, but it will have extreme requirements on insert performance, probably around 1000-2000 inserts per second at peak. The table itself will be huge, like zillions of rows.

Only very, very rarely will an existing row in this table need to be updated, and when that happens the change will need to be recorded. Since the solution is using Envers for just about all other tables in its schema, it seems natural to use Envers on this new table too.

But with the ValidityAuditStrategy, Envers will insert two rows for each and every INSERT operation into an audited table: one in the actual table and one in its shadow (aud) table. That does not look like a good idea for this new table, given the volume, the transactions per second, and the fact that updates to existing rows will be a very, very rare event.

I’ve looked into some options:
I was hoping that the AuditStrategy could be overridden on a table-by-table basis. This is not currently the case: it applies globally.

I can override the action on PostInsertEvent for this table only and suppress the insertion into the aud table. However, if a row in the table is later updated, the Envers listener logic will (correctly) fail and throw an exception because it cannot find the existing audit row to mark with an end revision.
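To make that failure mode concrete, here is a toy model in plain Java. It is not Envers code and does not use any Envers API; the class and method names are made up for illustration. It only mimics the bookkeeping a validity-style strategy has to do: on update, find the open audit row and stamp it with an end revision.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Envers code. It mimics the bookkeeping that a
// validity-style audit strategy performs, to show why skipping the audit row
// on insert breaks a later update.
public class ValidityAuditToyModel {

    // One row of the hypothetical audit table: entity id, start revision,
    // and an end revision that stays null while the row is current.
    static final class AuditRow {
        final long entityId;
        final int startRev;
        Integer endRev;

        AuditRow(long entityId, int startRev) {
            this.entityId = entityId;
            this.startRev = startRev;
        }
    }

    private final List<AuditRow> auditTable = new ArrayList<>();
    private int nextRev = 1;

    // Insert: optionally write the initial audit row.
    void onInsert(long entityId, boolean writeAuditRow) {
        int rev = nextRev++;
        if (writeAuditRow) {
            auditTable.add(new AuditRow(entityId, rev));
        }
    }

    // Update: close the currently open audit row, then add a new one.
    // Fails if no open audit row exists for the entity.
    void onUpdate(long entityId) {
        int rev = nextRev++;
        AuditRow open = auditTable.stream()
                .filter(r -> r.entityId == entityId && r.endRev == null)
                .findFirst()
                .orElseThrow(() -> new IllegalStateException(
                        "no open audit row to end-date for entity " + entityId));
        open.endRev = rev;
        auditTable.add(new AuditRow(entityId, rev));
    }

    int auditRowCount() {
        return auditTable.size();
    }

    public static void main(String[] args) {
        // Normal case: the insert was audited, so the update succeeds.
        ValidityAuditToyModel audited = new ValidityAuditToyModel();
        audited.onInsert(1L, true);
        audited.onUpdate(1L);
        System.out.println("audit rows: " + audited.auditRowCount());

        // Suppressed case: no audit row on insert, so the update fails.
        ValidityAuditToyModel suppressed = new ValidityAuditToyModel();
        suppressed.onInsert(1L, false);
        try {
            suppressed.onUpdate(1L);
        } catch (IllegalStateException e) {
            System.out.println("update failed: " + e.getMessage());
        }
    }
}
```

So any workaround that skips the audit row on insert would also need to handle the "no previous row" case on update, which is the part I don't see how to do cleanly.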

What are my options?

Thx

This is just how Envers works; I don’t think you can reasonably change this behavior. If you want to record only the updates, I think you might be better off creating a trigger on the database and implementing change tracking yourself.

Having said that, inserting a few thousand rows is not a big deal, so even if you double that amount because of Envers, it should be just fine. I’d recommend using the tool you know best and figuring out what to do next only if you actually have a problem. Chances are, you’ll never have a problem with the performance of this.

I appreciate your thoughts here on premature optimization. :)
The use case is not “a few thousand rows” (unless you mean the per-second figure); it is billions of rows. The database, in this case PostgreSQL, may actually be cool with the insert performance required, but there’s also the storage cost to consider: with the ValidityAuditStrategy you have at least 2x the number of “actual” rows regardless of whether any rows change. So it becomes 2x X billion rows. That would be a waste.
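A back-of-the-envelope version of that, with made-up numbers (5 billion inserts and 1 million updates are assumptions picked for illustration, not measurements), and assuming one audit row per audited change:

```java
// Rough row-count estimate. All figures in main() are illustrative
// assumptions, not measurements from a real system.
public class AuditRowEstimate {

    // Every insert and every update writes one audit row.
    static long rowsWithFullAudit(long inserts, long updates) {
        long baseRows = inserts;
        long auditRows = inserts + updates;
        return baseRows + auditRows;
    }

    // Hypothetical alternative: only updates write an audit row.
    static long rowsWithUpdateOnlyAudit(long inserts, long updates) {
        long baseRows = inserts;
        long auditRows = updates;
        return baseRows + auditRows;
    }

    public static void main(String[] args) {
        long inserts = 5_000_000_000L; // assumed
        long updates = 1_000_000L;     // assumed: updates are very rare

        System.out.println("full audit:        " + rowsWithFullAudit(inserts, updates));
        System.out.println("update-only audit: " + rowsWithUpdateOnlyAudit(inserts, updates));
    }
}
```

With updates this rare, nearly all of the audit table would be copies of rows that never changed.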

In conclusion, it would be natural if this particular table used DefaultAuditStrategy, while the others could continue to use ValidityAuditStrategy. This is not currently possible with Envers. I don’t know whether supporting multiple “realms” (Envers configs) within the same EntityManager would be a major undertaking for Envers, or whether this use case is isolated to me.

I could maybe have 2 different DataSources to facilitate two different Envers configs (e.g. two different classes annotated with @RevisionEntity), but I wouldn’t know how to actually do it.
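For what it's worth, a minimal sketch of what the configuration side of that might look like. The audit strategy is selected with the `org.hibernate.envers.audit_strategy` property, so two persistence units could each get their own value. The unit names `mainUnit` and `bulkUnit` are hypothetical, the strategy class names may differ between Hibernate versions, and I have not tested whether splitting the entities across two units works well in practice.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: builds the per-unit property maps. The persistence unit
// names mentioned in the comments are hypothetical.
public class TwoEnversConfigs {

    // Envers configuration property that selects the audit strategy.
    static final String STRATEGY_KEY = "org.hibernate.envers.audit_strategy";

    static Map<String, String> propsFor(String strategyClassName) {
        Map<String, String> props = new HashMap<>();
        props.put(STRATEGY_KEY, strategyClassName);
        return props;
    }

    public static void main(String[] args) {
        // Existing tables keep the validity strategy.
        Map<String, String> mainUnit =
                propsFor("org.hibernate.envers.strategy.ValidityAuditStrategy");

        // The high-volume table would live in its own unit with the default strategy.
        Map<String, String> bulkUnit =
                propsFor("org.hibernate.envers.strategy.DefaultAuditStrategy");

        // With JPA on the classpath, each map would be passed as the second
        // argument of Persistence.createEntityManagerFactory(unitName, props).
        System.out.println("mainUnit: " + mainUnit);
        System.out.println("bulkUnit: " + bulkUnit);
    }
}
```

One caveat: as far as I understand it, the default strategy still writes an audit row for every insert; what it skips is the end-revision update on the previous row. So this would separate the two configs but would not by itself remove the extra row per insert.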

From a quick look at the source code, it looks possible to alter the code to support multiple audit strategies, but the question is how would you configure that? Maybe @Naros can chime in here and help with this feature request.