Performance issues when enabling micrometer metrics

We are observing a plenty of following execution stacks in load testing when micrometer metrics are enabled:

"http-nio-8080-exec-1" #99 daemon prio=5 os_prio=0 tid=0x00007fa1f09d1800 nid=0x72 runnable [0x00007fa13ddeb000]
   java.lang.Thread.State: RUNNABLE
	at io.micrometer.core.instrument.Tags.<init>(Tags.java:44)
	at io.micrometer.core.instrument.Tags.and(Tags.java:103)
	at io.micrometer.core.instrument.Tags.and(Tags.java:121)
	at io.micrometer.core.instrument.FunctionTimer$Builder.tags(FunctionTimer.java:115)
	at io.micrometer.core.instrument.FunctionTimer$Builder.tags(FunctionTimer.java:107)
	at org.hibernate.stat.HibernateQueryMetrics$MetricsEventHandler.registerQueryMetric(HibernateQueryMetrics.java:150)
	at org.hibernate.stat.HibernateQueryMetrics$MetricsEventHandler.onPostLoad(HibernateQueryMetrics.java:105)

I suspect the root cause is implementation of org.hibernate.stat.HibernateQueryMetrics$MetricsEventHandler.onPostLoad method: for some reason it injects excessive query statistics when when Hibernate loads entities.

Hey there, thanks for the report. I think it might be best if you come join us in our Zulip Chat to discuss this. I don’t know OTOH who contributed this originally and what the idea behind this was exactly, but in the chat we can definitely figure this out, though you have to understand that this might have to be delayed until January as many people are off for holidays.

Hm…

I have taken closer look on that and it seems the story is following: code was contributed to Hibernate by micrometer team (you may find original code on GH, original CR is here, no discussion though) HibernateMetrics class actually looks good, however HibernateQueryMetrics class seems to be problematic - it tries to register performance monitor for every query being executed, but:

  • it misuses load event - I believe micrometer developers were initially need some kind of API on Hibernate side for that (somewhere near StatisticsImplementor#queryExecuted), but when the code was adopted by Hibernate nobody had recalled that
  • since it registers performance monitor for every query being executed and never removes those monitors, it is obvious that such implementation causes memory leaks: application may perform millions of ah-hoc queries
  • it produces wrong results: Hibernate uses BoundedConcurrentHashMap for query statistics, which in turn means query statistics sometimes may be removed and later added again as brand new QueryStatistics objects, however HibernateQueryMetrics fails to catch such changes