-
Bug
-
Resolution: Fixed
-
Critical
It seems like there is a deadlock between support core plugin and workflow-step-api-plugin.
Because of this issue, the UI becomes unresponsive which makes Jenkins admins to think the instance is down.
Found one Java-level deadlock: ============================= "Computer.threadPoolForRemoting [#25]": waiting to lock monitor 0x00007fa024cfffd8 (object 0x0000000142c07458, a java.util.logging.StreamHandler), which is held by "Running CpsFlowExecution[Owner[SDOM/Builds/P1208/P1208__5.5.0/9:SDOM/Builds/P1208/P1208__5.5.0 #9]]" "Running CpsFlowExecution[Owner[SDOM/Builds/P1208/P1208__5.5.0/9:SDOM/Builds/P1208/P1208__5.5.0 #9]]": waiting to lock monitor 0x00007f9ff0031568 (object 0x00000001407bbe98, a hudson.util.MaskingClassLoader), which is held by "Executor #4 for SI-Z0S4D : executing PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{runId=SDOM/Builds/P1208/P1208__5.5.0#9,context=CpsStepContext[71]:Owner[SDOM/Builds/P1208/P1208__5.5.0/9:SDOM/Builds/P1208/P1208__5.5.0 #9],cookie=6ef58131-2e13-4185-b552-c9be42e7949f}" "Executor #4 for SI-Z0S4D : executing PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{runId=SDOM/Builds/P1208/P1208__5.5.0#9,context=CpsStepContext[71]:Owner[SDOM/Builds/P1208/P1208__5.5.0/9:SDOM/Builds/P1208/P1208__5.5.0 #9],cookie=6ef58131-2e13-4185-b552-c9be42e7949f}": waiting to lock monitor 0x00007fa024cfffd8 (object 0x0000000142c07458, a java.util.logging.StreamHandler), which is held by "Running CpsFlowExecution[Owner[SDOM/Builds/P1208/P1208__5.5.0/9:SDOM/Builds/P1208/P1208__5.5.0 #9]]" Java stack information for the threads listed above: =================================================== "Computer.threadPoolForRemoting [#25]": at java.util.logging.StreamHandler.publish(Unknown Source) - waiting to lock <0x0000000142c07458> (a java.util.logging.StreamHandler) at com.cloudbees.jenkins.support.impl.JenkinsLogs$LogFile.publish(JenkinsLogs.java:213) at com.cloudbees.jenkins.support.impl.JenkinsLogs$CustomHandler.publish(JenkinsLogs.java:252) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.fine(Unknown Source) at hudson.remoting.Channel$1.handle(Channel.java:499) at hudson.remoting.AbstractByteArrayCommandTransport$1.handle(AbstractByteArrayCommandTransport.java:61) at org.jenkinsci.remoting.nio.NioChannelHub$2.run(NioChannelHub.java:594) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) "Running CpsFlowExecution[Owner[SDOM/Builds/P1208/P1208__5.5.0/9:SDOM/Builds/P1208/P1208__5.5.0 #9]]": at hudson.util.MaskingClassLoader.loadClass(MaskingClassLoader.java:70) - waiting to lock <0x00000001407bbe98> (a hudson.util.MaskingClassLoader) at java.lang.ClassLoader.loadClass(Unknown Source) - locked <0x00000001409dda60> (a hudson.ClassicPluginStrategy$DependencyClassLoader) at java.lang.ClassLoader.loadClass(Unknown Source) at jenkins.util.AntClassLoader.findBaseClass(AntClassLoader.java:1398) at jenkins.util.AntClassLoader.loadClass(AntClassLoader.java:1075) - locked <0x00000001409dd9f0> (a hudson.ClassicPluginStrategy$AntClassLoader2) at java.lang.ClassLoader.loadClass(Unknown Source) at org.jenkinsci.plugins.workflow.steps.StepExecution.getStatusBounded(StepExecution.java:132) at org.jenkinsci.plugins.workflow.steps.StepExecution.toString(StepExecution.java:107) at java.lang.String.valueOf(Unknown Source) at java.lang.StringBuilder.append(Unknown Source) at java.util.AbstractCollection.toString(Unknown Source) at java.util.Collections$UnmodifiableCollection.toString(Unknown Source) at java.text.MessageFormat.subformat(Unknown Source) at java.text.MessageFormat.format(Unknown Source) at java.text.Format.format(Unknown Source) at java.text.MessageFormat.format(Unknown Source) at java.util.logging.Formatter.formatMessage(Unknown Source) - locked <0x0000000142c07490> (a com.cloudbees.jenkins.support.SupportLogFormatter) at com.cloudbees.jenkins.support.SupportLogFormatter.format(SupportLogFormatter.java:101) at java.util.logging.StreamHandler.publish(Unknown Source) - locked <0x0000000142c07458> (a java.util.logging.StreamHandler) at com.cloudbees.jenkins.support.impl.JenkinsLogs$LogFile.publish(JenkinsLogs.java:213) at com.cloudbees.jenkins.support.impl.JenkinsLogs$CustomHandler.publish(JenkinsLogs.java:252) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:177) at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:174) at com.google.common.util.concurrent.Futures$6.run(Futures.java:975) at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:253) at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149) at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:134) at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:170) at com.google.common.util.concurrent.SettableFuture.set(SettableFuture.java:53) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$5.onSuccess(CpsFlowExecution.java:712) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$5.onSuccess(CpsFlowExecution.java:690) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:627) at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) "Executor #4 for SI-Z0S4D : executing PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{runId=SDOM/Builds/P1208/P1208__5.5.0#9,context=CpsStepContext[71]:Owner[SDOM/Builds/P1208/P1208__5.5.0/9:SDOM/Builds/P1208/P1208__5.5.0 #9],cookie=6ef58131-2e13-4185-b552-c9be42e7949f}": at java.util.logging.StreamHandler.publish(Unknown Source) - waiting to lock <0x0000000142c07458> (a java.util.logging.StreamHandler) at com.cloudbees.jenkins.support.impl.JenkinsLogs$LogFile.publish(JenkinsLogs.java:213) at com.cloudbees.jenkins.support.impl.JenkinsLogs$CustomHandler.publish(JenkinsLogs.java:252) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at org.eclipse.jetty.util.log.JavaUtilLog.debug(JavaUtilLog.java:111) at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:459) - locked <0x000000014001e980> (a org.eclipse.jetty.webapp.WebAppClassLoader) at java.lang.ClassLoader.loadClass(Unknown Source) - locked <0x00000001407bbe98> (a hudson.util.MaskingClassLoader) at hudson.util.MaskingClassLoader.loadClass(MaskingClassLoader.java:75) - locked <0x00000001407bbe98> (a hudson.util.MaskingClassLoader) at java.lang.ClassLoader.loadClass(Unknown Source) - locked <0x0000000140c70fc0> (a hudson.ClassicPluginStrategy$DependencyClassLoader) at java.lang.ClassLoader.loadClass(Unknown Source) at jenkins.util.AntClassLoader.findBaseClass(AntClassLoader.java:1398) at jenkins.util.AntClassLoader.loadClass(AntClassLoader.java:1075) - locked <0x0000000140c70f50> (a hudson.ClassicPluginStrategy$AntClassLoader2) at java.lang.ClassLoader.loadClass(Unknown Source) at org.jenkinsci.plugins.workflow.support.actions.LogActionImpl.stream(LogActionImpl.java:88) at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:73) at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask$PlaceholderExecutable.run(ExecutorStepExecution.java:513) at hudson.model.ResourceController.execute(ResourceController.java:98) at hudson.model.Executor.run(Executor.java:410) Found 1 deadlock.
I was also able to capture the same issue on a different time windows:
Found one Java-level deadlock: ============================= "com.cloudbees.jenkins.plugin.metrics.views.Alerter thread": waiting to lock monitor 0x00007fbfc01452d8 (object 0x00000001450406d8, a java.util.logging.StreamHandler), which is held by "Running CpsFlowExecution[Owner[CI_MAINTENANCE/RESTART_MASTER/40:CI_MAINTENANCE/RESTART_MASTER #40]]" "Running CpsFlowExecution[Owner[CI_MAINTENANCE/RESTART_MASTER/40:CI_MAINTENANCE/RESTART_MASTER #40]]": waiting to lock monitor 0x00007fbfc0143f98 (object 0x0000000142400060, a org.eclipse.jetty.webapp.WebAppClassLoader), which is held by "jenkins.util.Timer [#1]" "jenkins.util.Timer [#1]": waiting to lock monitor 0x00007fbfc01452d8 (object 0x00000001450406d8, a java.util.logging.StreamHandler), which is held by "Running CpsFlowExecution[Owner[CI_MAINTENANCE/RESTART_MASTER/40:CI_MAINTENANCE/RESTART_MASTER #40]]" Java stack information for the threads listed above: =================================================== "com.cloudbees.jenkins.plugin.metrics.views.Alerter thread": at java.util.logging.StreamHandler.publish(Unknown Source) - waiting to lock <0x00000001450406d8> (a java.util.logging.StreamHandler) at com.cloudbees.jenkins.support.impl.JenkinsLogs$LogFile.publish(JenkinsLogs.java:213) at com.cloudbees.jenkins.support.impl.JenkinsLogs$CustomHandler.publish(JenkinsLogs.java:252) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at hudson.model.AsyncPeriodicWork$1.run(AsyncPeriodicWork.java:91) at java.lang.Thread.run(Unknown Source) "Running CpsFlowExecution[Owner[CI_MAINTENANCE/RESTART_MASTER/40:CI_MAINTENANCE/RESTART_MASTER #40]]": at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:410) - waiting to lock <0x0000000142400060> (a org.eclipse.jetty.webapp.WebAppClassLoader) at java.lang.ClassLoader.loadClass(Unknown Source) - locked <0x00000001436497b0> (a hudson.util.MaskingClassLoader) at hudson.util.MaskingClassLoader.loadClass(MaskingClassLoader.java:75) - locked <0x00000001436497b0> (a hudson.util.MaskingClassLoader) at java.lang.ClassLoader.loadClass(Unknown Source) - locked <0x00000001462ae218> (a hudson.ClassicPluginStrategy$DependencyClassLoader) at java.lang.ClassLoader.loadClass(Unknown Source) at jenkins.util.AntClassLoader.findBaseClass(AntClassLoader.java:1398) at jenkins.util.AntClassLoader.loadClass(AntClassLoader.java:1075) - locked <0x00000001439c6988> (a hudson.ClassicPluginStrategy$AntClassLoader2) at java.lang.ClassLoader.loadClass(Unknown Source) at org.jenkinsci.plugins.workflow.steps.StepExecution.getStatusBounded(StepExecution.java:142) at org.jenkinsci.plugins.workflow.steps.StepExecution.toString(StepExecution.java:107) at java.lang.String.valueOf(Unknown Source) at java.lang.StringBuilder.append(Unknown Source) at java.util.AbstractCollection.toString(Unknown Source) at java.util.Collections$UnmodifiableCollection.toString(Unknown Source) at java.text.MessageFormat.subformat(Unknown Source) at java.text.MessageFormat.format(Unknown Source) at java.text.Format.format(Unknown Source) at java.text.MessageFormat.format(Unknown Source) at java.util.logging.Formatter.formatMessage(Unknown Source) - locked <0x0000000145040710> (a com.cloudbees.jenkins.support.SupportLogFormatter) at com.cloudbees.jenkins.support.SupportLogFormatter.format(SupportLogFormatter.java:101) at java.util.logging.StreamHandler.publish(Unknown Source) - locked <0x00000001450406d8> (a java.util.logging.StreamHandler) at com.cloudbees.jenkins.support.impl.JenkinsLogs$LogFile.publish(JenkinsLogs.java:213) at com.cloudbees.jenkins.support.impl.JenkinsLogs$CustomHandler.publish(JenkinsLogs.java:252) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:177) at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:174) at com.google.common.util.concurrent.Futures$6.run(Futures.java:975) at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:253) at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149) at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:134) at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:170) at com.google.common.util.concurrent.SettableFuture.set(SettableFuture.java:53) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$5.onSuccess(CpsFlowExecution.java:712) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$5.onSuccess(CpsFlowExecution.java:690) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:627) at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) "jenkins.util.Timer [#1]": at java.util.logging.StreamHandler.publish(Unknown Source) - waiting to lock <0x00000001450406d8> (a java.util.logging.StreamHandler) at com.cloudbees.jenkins.support.impl.JenkinsLogs$LogFile.publish(JenkinsLogs.java:213) at com.cloudbees.jenkins.support.impl.JenkinsLogs$CustomHandler.publish(JenkinsLogs.java:252) at java.util.logging.Logger.log(Unknown Source) at java.util.logging.Logger.doLog(Unknown Source) at java.util.logging.Logger.log(Unknown Source) at org.eclipse.jetty.util.log.JavaUtilLog.debug(JavaUtilLog.java:111) at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:459) - locked <0x0000000142400060> (a org.eclipse.jetty.webapp.WebAppClassLoader) at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:403) at hudson.util.Iterators.reverse(Iterators.java:168) at hudson.model.Queue.getItems(Queue.java:757) at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution.getStatus(ExecutorStepExecution.java:175) at org.jenkinsci.plugins.workflow.steps.StepExecution$1.call(StepExecution.java:134) at org.jenkinsci.plugins.workflow.steps.StepExecution$1.call(StepExecution.java:132) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Found 1 deadlock.
- blocks
-
JENKINS-39275 DurableTaskStep$Execution.getStatus can hang
-
- Resolved
-
- links to
Hard to be sure exactly what is going on, since the thread dumps do not indicate exactly why the body of getStatusBounded was triggering class loading—it is only using very generic classes that should long since have been loaded in Jenkins. It is possible the HotSpot compiler is eliding some important information.
The user in this case had a custom logger capturing ALL on the root component
, which included the FlowExecutionList logger at FINE triggering this issue.
The lesson seems to be that toString implementations may not do anything interesting, even with a timeout.