Best way to cascade thousands of objects?

Hi,

I’m facing a serious performance problem while trying to persist a root object that contains a large collection (100k objects). My application is ‘stuck’ for 20 to 30 minutes between the call to myRepo.saveAndFlush(root) and the first SQL query appearing in the logs.

Profiling the application shows that all the time is spent in StatefulPersistenceContext.getOwnerId, which allocates a lot of temporary objects and triggers 1-2 GCs per second. It looks like this method is called because of my bidirectional mapping, and the algorithm is not particularly well suited to ‘large’ collections. It might be the same problem as the Hibernate JIRA issue [HHH-1612] Serious performance lost within IdentityMap...

I’m wondering what would be the best way to persist such a hierarchy: an object A that contains a List<B> of 100k elements, where each B has a List<C> of fewer than 10 elements.

My current mapping is:

<class name="A" table="a">
...
	<bag name="bList" inverse="true">
		<key column="a_uuid" not-null="true"/>
		<one-to-many class="B"/>
	</bag>
</class>

<class name="B" table="b">
	<bag name="cList" table="assoc_table" cascade="all-delete-orphan">
		<key column="b_uuid" not-null="true"/>
		<element column="el">
			<type name="..."/>
		</element>
	</bag>

	<many-to-one name="parentA" column="a_uuid"/>
</class>

I have configured batching, but the performance problem happens before SQL queries are run, so I don’t think it really matters.

Do I need to remove the cascade and persist the List<B> separately?

From what I can see, A and B are connected through a foreign key on B (a_uuid), so why do you need cascading from A to B? You can simply persist the B objects directly, each pointing to A through that foreign key.

Having said that, if you have an idea for how to improve the performance for your use case, we happily accept PRs for that.
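
The suggestion of persisting the B objects directly, rather than through the collection on A, could be sketched roughly as follows. This is only an illustration under assumptions: `Store` is a hypothetical stand-in for the `persist`, `flush` and `clear` methods of a JPA `EntityManager`, `persistInBatches` and `RecordingStore` are made-up names, and the batch size is arbitrary. The idea is to flush and clear at regular intervals so that the persistence context never tracks more than one batch of entities at a time.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPersistSketch {

    /** Hypothetical stand-in for the EntityManager methods used in this sketch. */
    public interface Store {
        void persist(Object entity);
        void flush();
        void clear();
    }

    /**
     * Persists the children one by one, flushing and clearing after every
     * batchSize entities. Returns the number of flushes performed.
     */
    public static int persistInBatches(Store store, List<?> children, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        int pending = 0;
        for (Object child : children) {
            store.persist(child);
            pending++;
            if (pending == batchSize) {
                store.flush(); // push the pending inserts to the database
                store.clear(); // detach everything so the context stays small
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) { // flush the final, partially filled batch
            store.flush();
            store.clear();
            flushes++;
        }
        return flushes;
    }

    /** In-memory Store, used only to exercise the loop without a database. */
    public static class RecordingStore implements Store {
        public final List<Object> written = new ArrayList<>();
        private final List<Object> pending = new ArrayList<>();
        public int maxPending = 0;

        @Override
        public void persist(Object entity) {
            pending.add(entity);
            maxPending = Math.max(maxPending, pending.size());
        }

        @Override
        public void flush() {
            written.addAll(pending);
            pending.clear();
        }

        @Override
        public void clear() {
            pending.clear();
        }
    }
}
```

With a real `EntityManager`, A would presumably be persisted first and each B would have its parentA set before being passed to the loop. Note that after `clear()` the A instance is detached, so whether the B objects should keep pointing at it or at a reference re-obtained via `EntityManager.getReference` is something to verify for your setup.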