No Sessionproxy - no Session LazyInitializationException

Hey y’all, we are facing one of the worst things in software maintenance: sporadic failure of something that has worked wonderfully for decades, that is, until we updated to Hibernate 6.6.1 from a v3 baseline (I know, a huge step skipping several generations, but the old version was working so well). Anyhow, when lazy loading hundreds of thousands of one-to-many references each day, we get a handful (1-5) of failures scattered capriciously all over our business model and webapp functionality. The exception trace commonly starts with

org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: xxx.someOneToManyReferences: no Sessionproxy - no Session
at org.hibernate.collection.spi.AbstractPersistentCollection.throwLazyInitializationException(AbstractPersistentCollection.java:636)
at org.hibernate.collection.spi.AbstractPersistentCollection.withTemporarySessionIfNeeded(AbstractPersistentCollection.java:219)
at org.hibernate.collection.spi.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:615)
at org.hibernate.collection.spi.AbstractPersistentCollection.read(AbstractPersistentCollection.java:138)
at org.hibernate.collection.spi.PersistentSet.iterator(PersistentSet.java:166)

Looking into the ‘withTemporarySessionIfNeeded’ code, this is triggered by a trifecta: the reference not being initialized (the lazy part), the current session being null, and allowLoadOutsideTransaction being false.
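To make those three conditions concrete, here is a minimal toy model in plain Java. It is not Hibernate's code: ToySession, ToyLazyCollection, and the exception message are hypothetical stand-ins, and the temporary-session fallback is only simulated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Hibernate session; it carries no behavior here.
final class ToySession {
}

// Simplified model of a lazily loaded collection wrapper.
// Assumption: this is NOT Hibernate's implementation, only the shape of the check.
final class ToyLazyCollection {
    private final boolean allowLoadOutsideTransaction;
    private final List<String> elements = new ArrayList<>();
    private ToySession session;   // becomes null once the collection is detached
    private boolean initialized;  // stays false until the first successful read

    ToyLazyCollection(ToySession session, boolean allowLoadOutsideTransaction) {
        this.session = session;
        this.allowLoadOutsideTransaction = allowLoadOutsideTransaction;
    }

    void unsetSession() {
        session = null;
    }

    List<String> read() {
        if (!initialized) {
            if (session == null && !allowLoadOutsideTransaction) {
                // All three conditions hold: uninitialized, no session, no fallback allowed.
                throw new IllegalStateException("toy lazy load failed: no session");
            }
            // Either a session is attached or a temporary one would be opened; simulate the load.
            elements.add("child");
            initialized = true;
        }
        return elements;
    }
}

final class LazyLoadCheckDemo {
    public static void main(String[] args) {
        ToyLazyCollection attached = new ToyLazyCollection(new ToySession(), false);
        System.out.println(attached.read()); // session present, load succeeds

        ToyLazyCollection detached = new ToyLazyCollection(new ToySession(), false);
        detached.unsetSession();
        try {
            detached.read();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // uninitialized + no session + no fallback
        }
    }
}
```

Note that in this model a collection that was already initialized stays readable after unsetSession; only the first touch needs a session.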

It seems the read->initialize->withTemporarySessionIfNeeded call chain happens after the AbstractPersistentCollection session has been set (on the holding object’s load) and then lost: the session is set on the user’s nth request (when the hibernate session is constructed), and the loss is noticed on the n+1th request, where the lazy load occurs.

Trying to avoid delving too deeply into the “black (box) magic” at the heart of hibernate O/R marshalling: it seems that fetching the parent object under a session propagates that session to its lazy collections so they can be iterated over later, whether the parent is pulled in cold from the database or retrieved from cache (for us ehcache) via identifier, and whether that happens “now” (on the nth request) or “later” (on the n+1th).

Investigating that propagation: setCurrentSession is invoked on deserialization and via the various xxxVisitors, though it’s unclear to me how those sequence out.

Could something sporadically disrupt that session marking on the collection when it is pulled back in from cache? Possibly. But assuming the session does get set, the question would be whether unsetSession is being called to null out the AbstractPersistentCollection session (seems to be the only way?). There are not many usages of that method; I can’t screenshot them (since evidently I am only allowed a single image insert in the post), but they are:

StatefulPersistenceContext: clear, addCollection
AbstractFlushingEventListener: postFlush
EvictVisitor: evictCollection
StatelessSessionImpl: fetch

The last one, the fetch, which is presumably the real workhorse exercised millions of times over the course of a week or so, contains a try/finally block that may be the culprit:

    try {
        collectionDescriptor.initialize( key, this );
        handlePotentiallyEmptyCollection( persistentCollection, getPersistenceContextInternal(), key,
                collectionDescriptor );
        final StatisticsImplementor statistics = getFactory().getStatistics();
        if ( statistics.isStatisticsEnabled() ) {
            statistics.fetchCollection( collectionDescriptor.getRole() );
        }
    }
    finally {
        persistentCollection.unsetSession( this );
        if ( persistenceContext.isLoadFinished() ) {
            persistenceContext.clear();
        }
    }

But if so, that only raises the question of what might be failing in collectionDescriptor.initialize or handlePotentiallyEmptyCollection every so often. And since the unsetSession call sits in the finally block, it runs even when initialize proceeds as expected, so the session will be nulled out either way and there will be no further traversal.
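As a quick sanity check on that try/finally reasoning, here is a small standalone Java sketch. The step names are purely illustrative labels, not Hibernate calls; it only shows that the finally step runs on both the success path and the failure path, so the unsetSession call by itself cannot distinguish the rare failures from the millions of normal fetches.

```java
import java.util.ArrayList;
import java.util.List;

final class FinallyOrderDemo {
    // Records the order of the simulated steps (all step names are illustrative).
    static List<String> fetch(boolean failDuringInitialize) {
        List<String> steps = new ArrayList<>();
        try {
            try {
                steps.add("initialize");
                if (failDuringInitialize) {
                    throw new IllegalStateException("simulated initialize failure");
                }
                steps.add("statistics");
            } finally {
                // Runs whether or not the try body threw.
                steps.add("unsetSession");
            }
        } catch (IllegalStateException e) {
            steps.add("exception propagated");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(fetch(false)); // [initialize, statistics, unsetSession]
        System.out.println(fetch(true));  // [initialize, unsetSession, exception propagated]
    }
}
```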

The evict usage could also be the problem: when the application establishes a new hibernate session for identifiers tucked away in the user-level session (which spans requests), it does use eviction in certain cases, followed by (re)gets from the ehcached database, to catch any updates other threads may have made.

Which then brings up the whole possibility of threading. We are running in a tomcat servlet, so there’s that inherent threading over the hibernate session, and having just switched to ehcache from oscache may have opened the door there. I read somewhere (after using hibernate for 20 years) that the second level cache is to be used for objects that are never modified, i.e. read-only; not at all what we’ve been doing, so have we been “running with scissors” and are lucky not to be maimed? I also read that ehcache may be configured for highly concurrent access as long as multiple threads are not updating the same element; I am thinking the hibernate layering takes care of that with timestamp checking.

So, having revealed all sorts of ignorance, and given there is no simple test case that can be built to isolate and replicate the issue, this may seem like a desperate plea for some fluky confluence of lessons learned or esoteric knowledge held in the community to turn the light bulb on. I’ll throw this out while I brace myself to plunge into putting temporary probes in the hibernate code, which will be all sorts of fun to wade through; not quite like looking for stray WIMPs of dark matter deep in underground mines, but somehow that comes to mind.

The only way you can run into trouble AFAICT is if the lazy collection’s owner was removed from the Hibernate Session which created it, through detach/evict/clear, or that session was closed. Another possibility is during flush: if the collection is not referenced anymore through managed objects, then it will be detached as well.

How do you manage your Hibernate Session? When are these collection proxies created vs. initialized?
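To illustrate why the created-vs-initialized timing matters, here is a toy Java model of a session-per-request setup. Everything in it is hypothetical (ToyCollection, ToyRequestSession, the fixed size of 3, and the assumption that closing the session detaches whatever it loaded); it is a sketch of the failure shape, not of Hibernate internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lazy collection: remembers whether a session is still attached.
final class ToyCollection {
    private boolean attached = true;
    private boolean initialized;

    void detach() {
        attached = false;
    }

    int size() {
        if (!initialized) {
            if (!attached) {
                throw new IllegalStateException("toy: lazy collection touched after its session ended");
            }
            initialized = true; // simulate loading rows while the session is live
        }
        return 3; // pretend the collection holds three children
    }
}

// Hypothetical per-request session; assumption: close() detaches everything it loaded.
final class ToyRequestSession implements AutoCloseable {
    private final List<ToyCollection> tracked = new ArrayList<>();

    ToyCollection loadParentCollection() {
        ToyCollection collection = new ToyCollection();
        tracked.add(collection);
        return collection;
    }

    @Override
    public void close() {
        tracked.forEach(ToyCollection::detach);
        tracked.clear();
    }
}

final class RequestSequenceDemo {
    // Simulates request n: the collection is created, optionally touched, then the session closes.
    static ToyCollection loadInRequest(boolean touchBeforeClose) {
        try (ToyRequestSession session = new ToyRequestSession()) {
            ToyCollection collection = session.loadParentCollection();
            if (touchBeforeClose) {
                collection.size(); // initialized while the session is still open
            }
            return collection;
        }
    }

    public static void main(String[] args) {
        ToyCollection touchedInTime = loadInRequest(true);
        System.out.println(touchedInTime.size()); // still readable on request n+1

        ToyCollection touchedLate = loadInRequest(false);
        try {
            touchedLate.size(); // first touch on request n+1 comes too late
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```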