Bug
Resolution: Fixed
Minor
None
Jenkins LTS 2.176.2, 2.176.4, 2.190.1, running on Oracle JDK 8
I believe there is a memory leak in the prometheus-plugin.
I upgraded Jenkins from 2.176.2 to 2.190.1 and the prometheus plugin from 2.0.0 to 2.0.6 (along with many other plugins).
Soon after, we started receiving high JVM memory usage alerts, with the service eventually OOM-ing. The memory pressure continued even after rolling back Jenkins versions and upping the heap size.
After downgrading plugins in groups, I narrowed the cause down to the prometheus-plugin.
Interestingly, I tried to reproduce the behavior on a staging instance but could not get the prometheus-plugin to work at all (no metrics were actually provided). I'm not sure what combination of Jenkins version and other plugin versions (dependencies?) got prometheus-plugin 2.0.6 working in the first place, but I kept seeing this in the staging instance's Jenkins logs with prometheus-plugin 2.0.6:
INFO: prometheus_async_worker thread is still running. Execution aborted.
This message shows up in the logs every ~1-2 minutes, whereas the prometheus_async_worker usually takes only ~10000 ms to finish.
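For context, the message reads like the guard a periodic background task logs when its previous run has not finished by the time the next one is due. The following is only a minimal sketch of that pattern, not the plugin's or Jenkins' actual code: the class SkipIfRunningWorker and its methods tryRun and awaitCompletion are hypothetical names made up for illustration.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration: a worker that refuses to start a new run
// while the previous run's thread is still alive.
public class SkipIfRunningWorker {
    private final String name;
    private Thread thread; // most recently started run, if any

    public SkipIfRunningWorker(String name) {
        this.name = name;
    }

    /** Starts the task on a new thread unless the previous run is still alive. */
    public synchronized boolean tryRun(Runnable task) {
        if (thread != null && thread.isAlive()) {
            // Same wording as the log line in this report
            System.out.println("INFO: " + name + " thread is still running. Execution aborted.");
            return false;
        }
        thread = new Thread(task, name + " thread");
        thread.start();
        return true;
    }

    /** Blocks until the most recently started run (if any) has finished. */
    public void awaitCompletion() {
        Thread current;
        synchronized (this) {
            current = thread;
        }
        if (current == null) {
            return;
        }
        try {
            current.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        SkipIfRunningWorker worker = new SkipIfRunningWorker("prometheus_async_worker");
        CountDownLatch release = new CountDownLatch(1);

        // Simulated slow collection: blocks until released
        boolean first = worker.tryRun(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Next scheduled tick arrives while the first run is still going
        boolean second = worker.tryRun(() -> { });

        release.countDown();
        worker.awaitCompletion();

        // Once the slow run has finished, a new run is accepted again
        boolean third = worker.tryRun(() -> { });
        worker.awaitCompletion();

        System.out.println(first + " " + second + " " + third); // prints "true false true"
    }
}
```

If the plugin's worker behaves like this sketch, a message on every scheduled tick would mean that a single collection run is taking longer than the scheduling interval, or never finishing at all.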