-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
core and plugins fully up-to-date (e.g. jenkins core 2.120)
As suggested by svanoort – based upon:
- https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=337057&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-337057
I would open a separate issue for Timeouts suspending executions - especially if you can come up with a consistent way to reproduce it. I saw it from time to time with Pipelines doing very complex processing (where we can't block the shutdown forever and shouldn't).
My suspicion is that there's a subtle bug around the halt-at-shutdown logic, which may have been pre-existing but is visible now because the process is more closely monitored and logged now (also because we actually have some test coverage for it). Unfortunately
- https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=336885&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-336885
Steps to re-produce:
- Start scripted pipeline, e.g.:
#!/usr/bin/env groovy stage('stage') { timestamps { echo('In stage "stage"...') node { sh 'sleep 90' } echo('After sleeping...') } }
- Enter shutdown mode during "sh 'sleep ...'" step
- Restart Jenkins
- Now I am not sure if this is relevant or not, but: in my case there is an init.d groovy hook script that is setting Jenkins in shutdown mode right at the start-up of Jenkins...
- Not sure if it is sufficient to put it manually in shutdown mode as well, but I could imagine!?
- After successful (first) Jenkins restart (i.e. Jenkins logs "Jenkins is fully up and running" and is in shutdown mode) trigger the second restart
- Thread dump before: thread-dump-before.txt
- Thread dump a few seconds after triggering restart ( thread-dump-after.txt
), the interesting threads:
"Thread-1" #19 prio=5 os_prio=0 tid=0x00007f982804f800 nid=0x6b3c waiting on condition [0x00007f97bdde5000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000fedec7e0> (a com.google.common.util.concurrent.AbstractFuture$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:258) at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.suspendAll(CpsFlowExecution.java:1555) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104) at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175) at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296) at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117) at jenkins.model.Jenkins$18.execute(Jenkins.java:3338) at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:128) - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor) at jenkins.model.Jenkins$18.execute(Jenkins.java:3338) at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139) at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:276) - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor) at jenkins.model.Jenkins._cleanUpRunTerminators(Jenkins.java:3335) at jenkins.model.Jenkins.cleanUp(Jenkins.java:3256) at hudson.WebAppMain.contextDestroyed(WebAppMain.java:379) at org.eclipse.jetty.server.handler.ContextHandler.callContextDestroyed(ContextHandler.java:898) at org.eclipse.jetty.servlet.ServletContextHandler.callContextDestroyed(ServletContextHandler.java:545) at org.eclipse.jetty.server.handler.ContextHandler.stopContext(ContextHandler.java:873) at org.eclipse.jetty.servlet.ServletContextHandler.stopContext(ServletContextHandler.java:355) at org.eclipse.jetty.webapp.WebAppContext.stopWebapp(WebAppContext.java:1521) at org.eclipse.jetty.webapp.WebAppContext.stopContext(WebAppContext.java:1485) at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:927) at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:271) at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:569) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) - locked <0x00000000865c0ed0> (a java.lang.Object) at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:144) at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:162) at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:124) at org.eclipse.jetty.server.Server.doStop(Server.java:489) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) - locked <0x00000000865b0c78> (a java.lang.Object) at winstone.Launcher.shutdown(Launcher.java:307) at winstone.ShutdownHook.run(ShutdownHook.java:25) "SIGTERM handler" #105 daemon prio=9 os_prio=0 tid=0x00007f9820003000 nid=0x6b38 in Object.wait() [0x00007f982ccbb000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1252) - locked <0x00000000863b7778> (a winstone.ShutdownHook) at java.lang.Thread.join(Thread.java:1326) at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106) at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46) at java.lang.Shutdown.runHooks(Shutdown.java:123) at java.lang.Shutdown.sequence(Shutdown.java:167) at java.lang.Shutdown.exit(Shutdown.java:212) - locked <0x0000000086319db8> (a java.lang.Class for java.lang.Shutdown) at java.lang.Terminator$1.handle(Terminator.java:52) at sun.misc.Signal$1.run(Signal.java:212) at java.lang.Thread.run(Thread.java:748)
- Thread dump before: thread-dump-before.txt
So my guess was "... maybe because of running into timeouts when suspending (the more or less still suspended pipelines; because first action in init.d hook scripts is setting Jenkins in shutdown mode)!?"
- is related to
-
JENKINS-34256 Preparing Jenkins For Shutdown Hangs Running Pipelines
-
- Resolved
-
[JENKINS-51215] Timeouts suspending pipeline executions (that are already suspended?)
Link |
New:
This issue is related to |
Description |
Original:
As suggested by [~svanoort] -- based upon: # https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=337057&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-337057 {quote} I would open a separate issue for Timeouts suspending executions - especially if you can come up with a consistent way to reproduce it. I saw it from time to time with Pipelines doing very complex processing (where we can't block the shutdown forever and shouldn't). My suspicion is that there's a subtle bug around the halt-at-shutdown logic, which may have been pre-existing but is visible now because the process is more closely monitored and logged now (also because we actually have some test coverage for it). Unfortunately {quote} # https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=336885&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-336885 Steps to re-produce: # Start scripted pipeline, e.g.: {code} #!/usr/bin/env groovy stage('stage') { timestamps { echo('In stage "stage"...') node { sh 'sleep 90' } echo('After sleeping...') } } {code} # Enter shutdown mode during "sh 'sleep ...'" step # Restart Jenkins # _Now I am not sure if this is relevant or not, but:_ in my case there is an init.d groovy hook script that is setting Jenkins in shutdown mode right at the start-up of Jenkins... #* Not sure if it is sufficient to put it manually in shutdown mode as well, but I could imagine!? # After successful (first) Jenkins restart (i.e. Jenkins logs "Jenkins is fully up and running" and is in shutdown mode) trigger the second restart #* Thread dump before: #* Thread dump a few seconds after triggering restart, the interesting threads: {noformat} "Thread-1" #19 prio=5 os_prio=0 tid=0x00007f982804f800 nid=0x6b3c waiting on condition [0x00007f97bdde5000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000fedec7e0> (a com.google.common.util.concurrent.AbstractFuture$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:258) at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.suspendAll(CpsFlowExecution.java:1555) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104) at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175) at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296) at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117) at jenkins.model.Jenkins$18.execute(Jenkins.java:3338) at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:128) - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor) at jenkins.model.Jenkins$18.execute(Jenkins.java:3338) at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139) at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:276) - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor) at jenkins.model.Jenkins._cleanUpRunTerminators(Jenkins.java:3335) at jenkins.model.Jenkins.cleanUp(Jenkins.java:3256) at hudson.WebAppMain.contextDestroyed(WebAppMain.java:379) at org.eclipse.jetty.server.handler.ContextHandler.callContextDestroyed(ContextHandler.java:898) at org.eclipse.jetty.servlet.ServletContextHandler.callContextDestroyed(ServletContextHandler.java:545) at org.eclipse.jetty.server.handler.ContextHandler.stopContext(ContextHandler.java:873) at org.eclipse.jetty.servlet.ServletContextHandler.stopContext(ServletContextHandler.java:355) at org.eclipse.jetty.webapp.WebAppContext.stopWebapp(WebAppContext.java:1521) at org.eclipse.jetty.webapp.WebAppContext.stopContext(WebAppContext.java:1485) at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:927) at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:271) at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:569) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) - locked <0x00000000865c0ed0> (a java.lang.Object) at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:144) at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:162) at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:124) at org.eclipse.jetty.server.Server.doStop(Server.java:489) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) - locked <0x00000000865b0c78> (a java.lang.Object) at winstone.Launcher.shutdown(Launcher.java:307) at winstone.ShutdownHook.run(ShutdownHook.java:25) "SIGTERM handler" #105 daemon prio=9 os_prio=0 tid=0x00007f9820003000 nid=0x6b38 in Object.wait() [0x00007f982ccbb000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1252) - locked <0x00000000863b7778> (a winstone.ShutdownHook) at java.lang.Thread.join(Thread.java:1326) at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106) at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46) at java.lang.Shutdown.runHooks(Shutdown.java:123) at java.lang.Shutdown.sequence(Shutdown.java:167) at java.lang.Shutdown.exit(Shutdown.java:212) - locked <0x0000000086319db8> (a java.lang.Class for java.lang.Shutdown) at java.lang.Terminator$1.handle(Terminator.java:52) at sun.misc.Signal$1.run(Signal.java:212) at java.lang.Thread.run(Thread.java:748) {noformat} So my guess was "... maybe because of running into timeouts when suspending (the more or less still suspended pipelines; because first action in init.d hook scripts is setting Jenkins in shutdown mode)!?" |
New:
As suggested by [~svanoort] -- based upon: # https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=337057&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-337057 {quote} I would open a separate issue for Timeouts suspending executions - especially if you can come up with a consistent way to reproduce it. I saw it from time to time with Pipelines doing very complex processing (where we can't block the shutdown forever and shouldn't). My suspicion is that there's a subtle bug around the halt-at-shutdown logic, which may have been pre-existing but is visible now because the process is more closely monitored and logged now (also because we actually have some test coverage for it). Unfortunately {quote} # https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=336885&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-336885 Steps to re-produce: # Start scripted pipeline, e.g.: {code} #!/usr/bin/env groovy stage('stage') { timestamps { echo('In stage "stage"...') node { sh 'sleep 90' } echo('After sleeping...') } } {code} # Enter shutdown mode during "sh 'sleep ...'" step # Restart Jenkins # _Now I am not sure if this is relevant or not, but:_ in my case there is an init.d groovy hook script that is setting Jenkins in shutdown mode right at the start-up of Jenkins... #* Not sure if it is sufficient to put it manually in shutdown mode as well, but I could imagine!? # After successful (first) Jenkins restart (i.e. Jenkins logs "Jenkins is fully up and running" and is in shutdown mode) trigger the second restart #* Thread dump before: [^thread-dump-before.txt] #* Thread dump a few seconds after triggering restart ( [^thread-dump-after.txt] ), the interesting threads: {noformat} "Thread-1" #19 prio=5 os_prio=0 tid=0x00007f982804f800 nid=0x6b3c waiting on condition [0x00007f97bdde5000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000fedec7e0> (a com.google.common.util.concurrent.AbstractFuture$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:258) at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91) at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.suspendAll(CpsFlowExecution.java:1555) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104) at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175) at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296) at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117) at jenkins.model.Jenkins$18.execute(Jenkins.java:3338) at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:128) - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor) at jenkins.model.Jenkins$18.execute(Jenkins.java:3338) at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139) at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:276) - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor) at jenkins.model.Jenkins._cleanUpRunTerminators(Jenkins.java:3335) at jenkins.model.Jenkins.cleanUp(Jenkins.java:3256) at hudson.WebAppMain.contextDestroyed(WebAppMain.java:379) at org.eclipse.jetty.server.handler.ContextHandler.callContextDestroyed(ContextHandler.java:898) at org.eclipse.jetty.servlet.ServletContextHandler.callContextDestroyed(ServletContextHandler.java:545) at org.eclipse.jetty.server.handler.ContextHandler.stopContext(ContextHandler.java:873) at org.eclipse.jetty.servlet.ServletContextHandler.stopContext(ServletContextHandler.java:355) at org.eclipse.jetty.webapp.WebAppContext.stopWebapp(WebAppContext.java:1521) at org.eclipse.jetty.webapp.WebAppContext.stopContext(WebAppContext.java:1485) at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:927) at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:271) at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:569) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) - locked <0x00000000865c0ed0> (a java.lang.Object) at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:144) at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:162) at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:124) at org.eclipse.jetty.server.Server.doStop(Server.java:489) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) - locked <0x00000000865b0c78> (a java.lang.Object) at winstone.Launcher.shutdown(Launcher.java:307) at winstone.ShutdownHook.run(ShutdownHook.java:25) "SIGTERM handler" #105 daemon prio=9 os_prio=0 tid=0x00007f9820003000 nid=0x6b38 in Object.wait() [0x00007f982ccbb000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1252) - locked <0x00000000863b7778> (a winstone.ShutdownHook) at java.lang.Thread.join(Thread.java:1326) at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106) at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46) at java.lang.Shutdown.runHooks(Shutdown.java:123) at java.lang.Shutdown.sequence(Shutdown.java:167) at java.lang.Shutdown.exit(Shutdown.java:212) - locked <0x0000000086319db8> (a java.lang.Class for java.lang.Shutdown) at java.lang.Terminator$1.handle(Terminator.java:52) at sun.misc.Signal$1.run(Signal.java:212) at java.lang.Thread.run(Thread.java:748) {noformat} So my guess was "... maybe because of running into timeouts when suspending (the more or less still suspended pipelines; because first action in init.d hook scripts is setting Jenkins in shutdown mode)!?" |
Status | Original: Open [ 1 ] | New: In Progress [ 3 ] |
Status | Original: In Progress [ 3 ] | New: In Review [ 10005 ] |
Assignee | New: jang hyemi [ miraha ] |
Assignee | Original: jang hyemi [ miraha ] |