Bulk delete in Hibernate does not delete rows for some collection properties

I’m using Hibernate 5.2.12 with a PostgreSQL 9.x database, and I’m using hbm2ddl to generate the schema. I’m trying to optimize a persistence layer that doesn’t scale very well by doing bulk deletes instead of relying on cascading.

However, I found that bulk deletion in Hibernate will take care of some things and not others:

  • yes mapped class table
  • yes subclass tables
  • yes superclass tables
  • yes table for collection properties with a many-to-many association
  • NO table for other collection properties (such as values or composites)

Below is a simplified example of the situation; sadly, we are still stuck in HBM instead of using annotations.

<hibernate-mapping>
	<class abstract="true" 
			table="Resource" 
			name="com.example.RepoResource">
		<id name="id" type="long">
			<generator class="native"/>
		</id>
        <many-to-one column="parent_folder" name="parent"/> <!-- ref to another persistent type -->
<!-- more stuff -->
    </class>

	<joined-subclass 
			name="com.example.RepoListOfValues"
			extends="com.example.RepoResource"
			table="ListOfValues">
		<key column="id"/>
		<list table="ListOfValuesItem" name="values" fetch="subselect">
			<key column="id"/>
			<list-index column="idx"/>
			<composite-element class="com.example.RepoListOfValuesItem">
				<property name="label"/>
				<property name="value" type="serializable" length="20971520"/>
			</composite-element>
		</list>
		<list table="ListOfValuesRef" name="valueRefs" fetch="subselect">
			<key column="id"/>
			<list-index column="idx"/>
			<many-to-many column="input_control_id" class="com.example.RepoResource"/>
		</list>
	</joined-subclass>
</hibernate-mapping>

This file maps four tables:

  • Resource maps RepoResource, the base class in the hierarchy
  • ListOfValues maps RepoListOfValues, a subclass of RepoResource
  • ListOfValuesItem maps elements of type RepoListOfValuesItem in the values list
  • ListOfValuesRef maps items in the valueRefs list, creating a many-to-many association to other RepoResource objects

If I run an HQL delete which uses a subquery (leaving out details of the subquery):

delete from ListOfValues where parent in (...)

This will use the default bulk delete strategy which uses a temp table; the SQL looks something like this:

create temporary  table HT_ListOfValues (id int8 not null) 
insert into HT_ListOfValues select ...
delete from ListOfValuesRef where (id) IN (select id from HT_ListOfValues)
delete from ListOfValues where (id) IN (select id from HT_ListOfValues)
delete from Resource where (id) IN (select id from HT_ListOfValues)

There’s no deletion from ListOfValuesItem, but there’s a constraint on the reference from ListOfValuesItem to ListOfValues, so when we try to delete the ListOfValues rows, we hit a constraint violation.
It seems like this is just missing from the bulk delete implementation; I drilled into the latest version of TableBasedDeleteHandlerImpl.java which generates the SQL. This loop iterates through the properties of the persistent class, but only generates delete statements for many-to-many:

	// If many-to-many, delete the FK row in the collection table.
		// This partially overlaps with DeleteExecutor, but it instead uses the temp table in the idSubselect.
		for ( Type type : targetedPersister.getPropertyTypes() ) {
			if ( type.isCollectionType() ) {
				CollectionType cType = (CollectionType) type;
				AbstractCollectionPersister cPersister = (AbstractCollectionPersister) factory.getMetamodel().collectionPersister( cType.getRole() );
				if ( cPersister.isManyToMany() ) {
					deletes.add( generateDelete( cPersister.getTableName(),
							cPersister.getKeyColumnNames(), generateIdSubselect( targetedPersister, cPersister, idTableInfo ), "bulk delete - m2m join table cleanup"));
				}
			}
		}

Now I’m stuck–the only way I’ve been able to work around this has been to drop the constraints on the collection table (in this example ListOfValuesItem).
I don’t know of any way to delete these in bulk through HQL or I’d do that; it looks like I’ll have to run some native SQL to clean up.

Can someone confirm that this is a bug?

Can anyone suggest a better workaround?