JENKINS-74973

Deadlock between LazyLoadRunMapEntrySet.clearCache and AbstractLazyLoadRunMap.all


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Component: core
    • None

      I first reported this in https://github.com/jenkinsci/prometheus-plugin/issues/717, but I now think the issue is best addressed in Jenkins core.

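      From what I can tell, a deadlock can occur when one thread starts a new build while another thread reads the number of builds of the same job. In the traces below, the executor thread holds the RunMap monitor (taken in AbstractLazyLoadRunMap.put) and is blocked on the LazyLoadRunMapEntrySet monitor in clearCache, while the Prometheus worker thread holds the LazyLoadRunMapEntrySet monitor (taken in LazyLoadRunMapEntrySet.all) and is blocked on the same RunMap monitor in AbstractLazyLoadRunMap.all.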

      The stack traces of the two threads that appear to be deadlocking each other are:

          "Executor #-1 for Built-In Node" Id=138915 Group=main BLOCKED on jenkins.model.lazy.LazyLoadRunMapEntrySet@1bcc66c7 owned by "prometheus_async_worker thread" Id=138789
          	at jenkins.model.lazy.LazyLoadRunMapEntrySet.clearCache(LazyLoadRunMapEntrySet.java:37)
          	-  blocked on jenkins.model.lazy.LazyLoadRunMapEntrySet@1bcc66c7
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.put(AbstractLazyLoadRunMap.java:625)
          	-  locked hudson.model.RunMap@3f263d52
          	at jenkins.model.lazy.AbstractLazyLoadRunMap._put(AbstractLazyLoadRunMap.java:606)
          	at hudson.model.RunMap.put(RunMap.java:227)
          	at jenkins.model.lazy.LazyBuildMixIn.newBuild(LazyBuildMixIn.java:192)
          	-  locked org.jenkinsci.plugins.workflow.job.WorkflowJob$1@149939d4
          	at jenkins.model.ParameterizedJobMixIn$ParameterizedJob.createExecutable(ParameterizedJobMixIn.java:507)
          	at jenkins.model.ParameterizedJobMixIn$ParameterizedJob.createExecutable(ParameterizedJobMixIn.java:325)
          	at hudson.model.Executor$1.call(Executor.java:374)
          	at hudson.model.Executor$1.call(Executor.java:354)
          	at hudson.model.Queue._withLock(Queue.java:1470)
          	at hudson.model.Queue.withLock(Queue.java:1326)
          	at hudson.model.Executor.run(Executor.java:354)
          
          	Number of locked synchronizers = 1
          	- java.util.concurrent.locks.ReentrantLock$NonfairSync@323aed2d 

      and

          "prometheus_async_worker thread" Id=138789 Group=main BLOCKED on hudson.model.RunMap@3f263d52 owned by "Executor #-1 for Built-In Node" Id=138915
          	at jenkins.model.lazy.AbstractLazyLoadRunMap.all(AbstractLazyLoadRunMap.java:655)
          	-  blocked on hudson.model.RunMap@3f263d52
          	at jenkins.model.lazy.LazyLoadRunMapEntrySet.all(LazyLoadRunMapEntrySet.java:32)
          	-  locked jenkins.model.lazy.LazyLoadRunMapEntrySet@1bcc66c7
          	at jenkins.model.lazy.LazyLoadRunMapEntrySet.size(LazyLoadRunMapEntrySet.java:42)
          	at java.base@17.0.13/java.util.AbstractMap.size(AbstractMap.java:85)
          	at java.base@17.0.13/java.util.Collections$UnmodifiableMap.size(Collections.java:1498)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.collectors.jobs.NbBuildsGauge.calculateMetric(NbBuildsGauge.java:33)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.collectors.jobs.NbBuildsGauge.calculateMetric(NbBuildsGauge.java:10)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.JobCollector.appendJobMetrics(JobCollector.java:231)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.JobCollector.lambda$collect$0(JobCollector.java:167)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.JobCollector$$Lambda$1184/0x00007fec84da47b0.accept(Unknown Source)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.util.Jobs.forEachJob(Jobs.java:19)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.JobCollector.collect(JobCollector.java:156)
          	at PluginClassLoader for prometheus//io.prometheus.client.Collector.collect(Collector.java:45)
          	at PluginClassLoader for prometheus//io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:204)
          	at PluginClassLoader for prometheus//io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:219)
          	at PluginClassLoader for prometheus//io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:152)
          	at java.base@17.0.13/java.util.Enumeration$1.next(Enumeration.java:123)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.config.disabledmetrics.FilteredMetricEnumeration.filterList(FilteredMetricEnumeration.java:21)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.config.disabledmetrics.FilteredMetricEnumeration.<init>(FilteredMetricEnumeration.java:15)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.service.DefaultPrometheusMetrics.collectMetrics(DefaultPrometheusMetrics.java:101)
          	at PluginClassLoader for prometheus//org.jenkinsci.plugins.prometheus.service.PrometheusAsyncWorker.execute(PrometheusAsyncWorker.java:35)
          	at hudson.model.AsyncPeriodicWork.lambda$doRun$0(AsyncPeriodicWork.java:102)
          	at hudson.model.AsyncPeriodicWork$$Lambda$1112/0x00007fec84d5dc30.run(Unknown Source)
          	at java.base@17.0.13/java.lang.Thread.run(Thread.java:840) 
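
      To illustrate the lock ordering that the two traces above suggest, here is a minimal, self-contained sketch. It does not use any Jenkins code: `runMap` and `entrySet` are hypothetical stand-ins for the RunMap and LazyLoadRunMapEntrySet monitors, and the class and thread names are made up for this example. One thread takes the monitors in the order seen in the executor trace, the other in the order seen in the Prometheus worker trace, and the JVM's own deadlock detection is then asked whether the two threads form a cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderInversionSketch {

    /**
     * Starts two daemon threads that acquire two monitors in opposite order and
     * returns how many of those two threads the JVM reports as deadlocked.
     */
    static int reproduce() {
        // Hypothetical stand-ins for the monitors named in the traces.
        final Object runMap = new Object();    // plays the RunMap instance
        final Object entrySet = new Object();  // plays the LazyLoadRunMapEntrySet instance
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Same order as the executor trace: RunMap first, then the entry set.
        Thread executor = new Thread(() -> {
            synchronized (runMap) {
                arriveAndWait(bothHoldFirstLock);
                synchronized (entrySet) {
                    // Never reached once the other thread holds entrySet.
                }
            }
        }, "executor-like");

        // Same order as the worker trace: entry set first, then RunMap.
        Thread scraper = new Thread(() -> {
            synchronized (entrySet) {
                arriveAndWait(bothHoldFirstLock);
                synchronized (runMap) {
                    // Never reached once the other thread holds runMap.
                }
            }
        }, "scraper-like");

        // Daemon threads so the stuck threads do not keep the JVM alive.
        executor.setDaemon(true);
        scraper.setDaemon(true);
        executor.start();
        scraper.start();

        // Poll the JVM's monitor deadlock detection for up to about 5 seconds.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no cycle exists
            int ours = 0;
            if (ids != null) {
                for (long id : ids) {
                    if (id == executor.getId() || id == scraper.getId()) {
                        ours++;
                    }
                }
            }
            if (ours == 2) {
                return ours;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return ours;
            }
        }
        return 0;
    }

    /** Signals that the caller holds its first monitor, then waits for the other thread. */
    static void arriveAndWait(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + reproduce());
    }
}
```

      The latch only makes the interleaving deterministic for the demonstration. In the real report the same interleaving would depend on timing, which may be why the deadlock is intermittent.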

            Assignee: Unassigned
            Reporter: tzhu