-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
core and plugins fully up-to-date (e.g. jenkins core 2.120)
As suggested by svanoort – based upon:
- https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=337057&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-337057
I would open a separate issue for Timeouts suspending executions - especially if you can come up with a consistent way to reproduce it. I saw it from time to time with Pipelines doing very complex processing (where we can't block the shutdown forever and shouldn't).
My suspicion is that there's a subtle bug around the halt-at-shutdown logic, which may have been pre-existing but is visible now because the process is more closely monitored and logged now (also because we actually have some test coverage for it). Unfortunately
- https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=336885&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-336885
Steps to re-produce:
- Start scripted pipeline, e.g.:
#!/usr/bin/env groovy stage('stage') { timestamps { echo('In stage "stage"...') node { sh 'sleep 90' } echo('After sleeping...') } }
- Enter shutdown mode during "sh 'sleep ...'" step
- Restart Jenkins
- Now I am not sure if this is relevant or not, but: in my case there is an init.d groovy hook script that is setting Jenkins in shutdown mode right at the start-up of Jenkins...
- Not sure if it is sufficient to put it manually in shutdown mode as well, but I could imagine!?
- After successful (first) Jenkins restart (i.e. Jenkins logs "Jenkins is fully up and running" and is in shutdown mode) trigger the second restart
- Thread dump before: thread-dump-before.txt
- Thread dump a few seconds after triggering restart ( thread-dump-after.txt ), the interesting threads:
"Thread-1" #19 prio=5 os_prio=0 tid=0x00007f982804f800 nid=0x6b3c waiting on condition [0x00007f97bdde5000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000fedec7e0> (a com.google.common.util.concurrent.AbstractFuture$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:258) at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.suspendAll(CpsFlowExecution.java:1555) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104) at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175) at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296) at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117) at jenkins.model.Jenkins$18.execute(Jenkins.java:3338) at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:128) - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor) at jenkins.model.Jenkins$18.execute(Jenkins.java:3338) at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139) at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:276) - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor) at jenkins.model.Jenkins._cleanUpRunTerminators(Jenkins.java:3335) at jenkins.model.Jenkins.cleanUp(Jenkins.java:3256) at hudson.WebAppMain.contextDestroyed(WebAppMain.java:379) at org.eclipse.jetty.server.handler.ContextHandler.callContextDestroyed(ContextHandler.java:898) at org.eclipse.jetty.servlet.ServletContextHandler.callContextDestroyed(ServletContextHandler.java:545) at org.eclipse.jetty.server.handler.ContextHandler.stopContext(ContextHandler.java:873) at org.eclipse.jetty.servlet.ServletContextHandler.stopContext(ServletContextHandler.java:355) at org.eclipse.jetty.webapp.WebAppContext.stopWebapp(WebAppContext.java:1521) at org.eclipse.jetty.webapp.WebAppContext.stopContext(WebAppContext.java:1485) at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:927) at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:271) at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:569) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) - locked <0x00000000865c0ed0> (a java.lang.Object) at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:144) at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:162) at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:124) at org.eclipse.jetty.server.Server.doStop(Server.java:489) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) - locked <0x00000000865b0c78> (a java.lang.Object) at winstone.Launcher.shutdown(Launcher.java:307) at winstone.ShutdownHook.run(ShutdownHook.java:25) "SIGTERM handler" #105 daemon prio=9 os_prio=0 tid=0x00007f9820003000 nid=0x6b38 in Object.wait() [0x00007f982ccbb000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1252) - locked <0x00000000863b7778> (a winstone.ShutdownHook) at java.lang.Thread.join(Thread.java:1326) at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106) at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46) at java.lang.Shutdown.runHooks(Shutdown.java:123) at java.lang.Shutdown.sequence(Shutdown.java:167) at java.lang.Shutdown.exit(Shutdown.java:212) - locked <0x0000000086319db8> (a java.lang.Class for java.lang.Shutdown) at java.lang.Terminator$1.handle(Terminator.java:52) at sun.misc.Signal$1.run(Signal.java:212) at java.lang.Thread.run(Thread.java:748)
So my guess was "... maybe because of running into timeouts when suspending (the more or less still suspended pipelines; because first action in init.d hook scripts is setting Jenkins in shutdown mode)!?"
- is related to
-
JENKINS-34256 Preparing Jenkins For Shutdown Hangs Running Pipelines
- Resolved