Excessive memory usage in org.hibernate.engine.internal.StatefulPersistenceContext

I have an application that uses Hibernate and it’s running out of memory with a medium volume dataset (~3 million records). When analysing the memory dump using Eclipse’s Memory Analyser I can see that StatefulPersistenceContext appears to be holding a copy of the record in memory in addition to the object itself, doubling the memory usage.

I’m able to reproduce this on a slightly smaller scale with a defined workflow, but am unable to simplify it to the level that I can put the full application here. The workflow is:

  1. Insert ~400,000 records (Fruit) into the database from a file
  2. Get all of the Fruits from the database and find if there are any complementary items to create ~150,000 Baskets (containing two Fruits)
  3. Retrieve all of the data - Fruits & Baskets - and save to a file

It’s running out of memory at the final stage, and the heap dump shows StatefulPersistenceContext has hundreds of thousands of Fruits in memory, in addition to the Fruits we retrieved to save to the file.

I’ve looked around online and the suggestion appears to be to use QueryHints.READ_ONLY on the query, or to set readOnly on the transaction, but neither of these seems to have reduced the size of the StatefulPersistenceContext.

Is there something else I should be looking at?

Examples of the classes / queries I’m using:

public interface ShoppingService {
    public void createBaskets();

    public void loadFromFile(ObjectInput input);

    public void saveToFile(ObjectOutput output);
}

@Service
public class ShoppingServiceImpl implements ShoppingService {
    @Autowired
    private FruitDAO fDAO;

    @Autowired
    private BasketDAO bDAO;

    @Override
    public void createBaskets() {
        bDAO.add(Basket.generate(fDAO.getAll()));
    }

    @Override
    public void loadFromFile(ObjectInput input) {
        SavedState state = ((SavedState) input.readObject());

        fDAO.add(state.getFruits());
        bDAO.add(state.getBaskets());
    }

    @Override
    public void saveToFile(ObjectOutput output) {
        output.writeObject(new SavedState(fDAO.getAll(), bDAO.getAll()));
    }

    public static void main(String[] args) throws Throwable {
        ShoppingService service = null;

        try (ObjectInput input = new ObjectInputStream(new FileInputStream("path\\to\\input\\file"))) {
            service.loadFromFile(input);
        }

        service.createBaskets();

        try (ObjectOutput output = new ObjectOutputStream(new FileOutputStream("path\\to\\output\\file"))) {
            service.saveToFile(output);
        }
    }
}

@Entity
public class Fruit {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    private String name;

    // ~ 200 string fields
}

public interface FruitDAO {
    public void add(Collection<Fruit> elements);

    public List<Fruit> getAll();
}

@Repository
public class JPAFruitDAO implements FruitDAO {
    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional()
    public void add(Collection<Fruit> elements) {
        elements.forEach(em::persist);
    }

    @Override
    public List<Fruit> getAll() {
        return em.createQuery("FROM Fruit", Fruit.class).getResultList();
    }
}

@Entity
public class Basket {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    @OneToOne
    @JoinColumn(name = "arow")
    private Fruit aRow;

    @OneToOne
    @JoinColumn(name = "brow")
    private Fruit bRow;

    public static Collection<Basket> generate(List<Fruit> fruits) {
        // Some complicated business logic that does things
        return null;
    }
}

public interface BasketDAO {
    public void add(Collection<Basket> elements);

    public List<Basket> getAll();
}

@Repository
public class JPABasketDAO implements BasketDAO {
    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional()
    public void add(Collection<Basket> elements) {
        elements.forEach(em::persist);
    }

    @Override
    public List<Basket> getAll() {
        return em.createQuery("FROM Basket", Basket.class).getResultList();
    }
}

public class SavedState {
    private Collection<Fruit> fruits;
    private Collection<Basket> baskets;
}

Hibernate needs to keep the loaded state of each managed entity in the persistence context so that it can detect changes (dirty checking) at flush time. Since you don’t seem to need this, you can use a StatelessSession instead. Alternatively, you can call EntityManager.clear() after your query to clear the persistence context manually, which gets rid of the loaded state.
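
To make the idea of "loaded state" more concrete, here is a deliberately simplified toy model in plain Java. It is not how StatefulPersistenceContext is implemented; ToyFruit and ToyPersistenceContext are made-up names for illustration only. It just shows the principle: for every managed entity the context remembers a snapshot of the field values, compares against it to decide whether the entity is dirty, and forgets everything on clear().

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical entity with two fields (the real Fruit has ~200).
final class ToyFruit {
    String name;
    String colour;

    ToyFruit(String name, String colour) {
        this.name = name;
        this.colour = colour;
    }
}

// Toy stand-in for a persistence context; NOT Hibernate's actual implementation.
final class ToyPersistenceContext {
    // managed entity instance -> snapshot of its field values at load time
    private final Map<ToyFruit, Object[]> loadedState = new IdentityHashMap<>();

    // Start tracking an entity and remember its current field values.
    ToyFruit manage(ToyFruit fruit) {
        loadedState.put(fruit, new Object[] { fruit.name, fruit.colour });
        return fruit;
    }

    // Dirty check: compare the current field values with the snapshot.
    boolean isDirty(ToyFruit fruit) {
        Object[] snapshot = loadedState.get(fruit);
        if (snapshot == null) {
            return false; // not tracked, so there is nothing to compare against
        }
        return !Arrays.equals(snapshot, new Object[] { fruit.name, fruit.colour });
    }

    int trackedCount() {
        return loadedState.size();
    }

    // Similar in spirit to EntityManager.clear(): drop all tracking data.
    void clear() {
        loadedState.clear();
    }
}
```

In this toy model, references the caller still holds stay valid after clear(); only the context's own bookkeeping is released, and later changes to those objects are no longer detected.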

I’m using Spring Data, so I don’t think StatelessSession is available to me, at least not on a per-query basis.
I tried using EntityManager.clear() and that didn’t seem to do anything?

It clears the persistence context. So when you do

em.find(MyEntity.class, id)

the entity will be added to your persistence context. But when you call EntityManager.clear(), the persistence context will be cleared.

Just call em.clear() after getResultList() in your DAOs and this should be fine.

Thanks - I’ve been trying that this morning but I’m still seeing the same behaviour as before.

Because I can only reproduce it with a large volume of data, and I can’t work out how to view the content of the PersistenceContext outside of a memory dump, it’s difficult to investigate any further.

Well, the problem is that you are using managed entities when you should probably be using a DTO. You could also try something like this to remove entities from the persistence context as they are added:

em.createQuery("FROM Basket", Basket.class).getResultStream().map(b -> { em.clear(); return b; }).collect(Collectors.toList());

or you use the StatelessSession API:

public class JPABasketDAO implements BasketDAO {
    @PersistenceUnit
    private EntityManagerFactory emf;
    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional()
    public void add(Collection<Basket> elements) {
        elements.forEach(em::persist);
    }

    @Override
    public List<Basket> getAll() {
        try ( StatelessSession s = emf.unwrap(SessionFactory.class).openStatelessSession() ) {
            return s.createQuery("FROM Basket", Basket.class).getResultList();
        }
    }
}
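
As for the DTO option mentioned above, a rough sketch could look like the following. FruitSummary, FruitQueries, the com.example package and the choice of id and name are all assumptions made for illustration; nothing like this exists in the code you posted. The point is that results of a JPQL constructor expression are plain objects rather than managed entities, so the persistence context does not track them.

```java
// Hypothetical DTO; name, package and selected fields are illustrative only.
// In a real project this would be a public class in its own file
// (here assumed to live in com.example) with a public constructor.
final class FruitSummary {
    private final Long id;
    private final String name;

    FruitSummary(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    Long getId() {
        return id;
    }

    String getName() {
        return name;
    }
}

final class FruitQueries {
    // JPQL constructor expression: uses the fully qualified DTO class name
    // and needs a constructor matching the selected attributes.
    static final String ALL_SUMMARIES =
            "SELECT new com.example.FruitSummary(f.id, f.name) FROM Fruit f";

    private FruitQueries() {
    }

    // Intended use inside a DAO (not compiled here, requires JPA):
    //   em.createQuery(FruitQueries.ALL_SUMMARIES, FruitSummary.class).getResultList();
}
```

If you really need all ~200 fields in the output file, the DTO would have to carry them all, so whether this is practical depends on what saveToFile actually has to write.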