Bug
Resolution: Unresolved
Blocker
Jenkins core 2.107.3
workflow-aggregator: 2.5
workflow-api: 2.27
workflow-basic-steps: 2.7
workflow-cps: 2.53
workflow-cps-global-lib: 2.9
workflow-durable-task-step: 2.19
workflow-job: 2.21
workflow-multibranch: 2.19
workflow-scm-step: 2.6
workflow-step-api: 2.15
workflow-support: 2.18
pipeline-build-step: 2.7
pipeline-graph-analysis: 1.6
pipeline-input-step: 2.8
pipeline-milestone-step: 1.3.1
pipeline-model-api: 1.2.9
pipeline-model-declarative-agent: 1.1.1
pipeline-model-definition: 1.2.9
pipeline-model-extensions: 1.2.9
pipeline-rest-api: 2.10
pipeline-stage-step: 2.3
pipeline-stage-tags-metadata: 1.2.9
pipeline-stage-view: 2.10
pipeline-utility-steps: 2.1.0
Three times in the last two weeks, we've had our Jenkins server stop responding to requests. When I check syslog, I see errors like this:
Jun 30 16:07:18 jenkins [jenkins]: Jun 30, 2018 4:07:18 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
Jun 30 16:07:18 jenkins [jenkins]: INFO: Running CpsFlowExecutionOwner[project/263:project #263] unresponsive for 5 sec
Jun 30 16:07:18 jenkins [jenkins]: Jun 30, 2018 4:07:18 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
Jun 30 16:07:18 jenkins [jenkins]: INFO: Running CpsFlowExecutionOwner[project/368:project #368] unresponsive for 5 sec
Jun 30 16:07:18 jenkins [jenkins]: Jun 30, 2018 4:07:18 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
Jun 30 16:07:18 jenkins [jenkins]: INFO: Running CpsFlowExecutionOwner[project/318:project #318] unresponsive for 5 sec
These seem to persist indefinitely and there don't seem to be any other relevant messages in the log. The Web UI just hangs until nginx times out.
The Java process will then refuse to stop when I try to restart the service and I have to kill it with kill -9.
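For anyone who wants to watch for this state automatically, the log lines above can be picked apart with a small parser. The following is only a sketch: the class name UnresponsiveLogLine is hypothetical, and it only understands the units that show up in this report (sec, min, hr). Other units are not handled.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Jenkins): parses "unresponsive for ..." log lines. */
final class UnresponsiveLogLine {
    // Captures the subject, the number, and the unit. Only the units seen in this
    // report (sec, min, hr) are handled; anything else is treated as "no match".
    private static final Pattern LINE = Pattern.compile(
            "INFO: (.+?) unresponsive for ([0-9]+(?:\\.[0-9]+)?) (sec|min|hr)\\b");

    final String subject;  // e.g. "Running CpsFlowExecutionOwner[project/263:project #263]"
    final double seconds;  // reported duration, normalised to seconds

    private UnresponsiveLogLine(String subject, double seconds) {
        this.subject = subject;
        this.seconds = seconds;
    }

    static Optional<UnresponsiveLogLine> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        String unit = m.group(3);
        double factor = unit.equals("hr") ? 3600 : unit.equals("min") ? 60 : 1;
        return Optional.of(
                new UnresponsiveLogLine(m.group(1), Double.parseDouble(m.group(2)) * factor));
    }

    public static void main(String[] args) {
        String line = "Jun 30 16:07:18 jenkins [jenkins]: INFO: Running "
                + "CpsFlowExecutionOwner[project/263:project #263] unresponsive for 5 sec";
        parse(line).ifPresent(p -> System.out.println(p.subject + " -> " + p.seconds + " s"));
    }
}
```

A watchdog could feed each syslog line through parse and alert once the reported duration keeps growing for the same subject.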
[JENKINS-52362] Jenkins hangs due to "Running CpsFlowExecution unresponsive"
This happens randomly, maybe 30% of the time. I'm using Jenkins to build and run Spring Boot Docker containers. Jenkins itself also runs in a container.
The message I am getting:
Jul 09, 2018 9:13:40 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
INFO: org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#336]: checking /var/jenkins_home/workspace/tonicdm on unresponsive for 5.1 sec
We are also seeing the same issue.
Jenkins ver. 2.121.1
Jul 22, 2018 6:53:27 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
INFO: Running CpsFlowExecutionOwner[-Project/872:Project #872 unresponsive for 22 hr
Also seeing this issue. It starts looping with the unresponsive time going up, but otherwise no change. It never seems to recover from this state, even after 12+ hours.
Running jenkins from the Docker tag: jenkins/jenkins:lts
Currently at version: 2.121.3
Log output:
Sep 01, 2018 11:42:41 PM com.squareup.okhttp.internal.Platform$JdkWithJettyBootPlatform getSelectedProtocol
INFO: ALPN callback dropped: SPDY and HTTP/2 are disabled. Is alpn-boot on the boot class path?
Sep 01, 2018 11:43:18 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
INFO: org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#570]: checking REDACTED on Docker (i-06939e1a358dc4ce5) unresponsive for 5 sec
Sep 01, 2018 11:43:27 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
INFO: org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#570]: checking REDACTED on Docker (i-06939e1a358dc4ce5) unresponsive for 13 sec
Sep 01, 2018 11:44:00 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
I have not been able to reproduce this since adding swap to my jenkins master docker host. I think this condition may somehow be triggered by low memory.
jniedrauer pdouglas Please can you grab and attach a thread dump from when you see this issue?
Similar issue also reproducible here on an older machine. Setup (amongst others):
- Jenkins core 2.60.3
- Yet Another Docker Plugin 0.1.0-rc47
- Jenkins running in docker container, connecting to another docker server for running jobs on the slave.
The system locks up, but apparently keeps running internally.
The HTTP server cannot be down entirely, because GET requests to the /api/json endpoint (which we use for "availability pinging") kept responding at the usual response times.
This Jenkins runs jobs every 5 minutes, so we can pin down the time of the lock-up quite precisely.
I cross-checked memory: there was more than 120 MB of free heap for the Java process, plus a further 3 GB of RAM.
Operating system and Docker logs around that time show nothing suspicious.
svanoort I have not been able to reproduce this after weeks of heavy use. It may have been adding swap that fixed it, or it may have been that a new version was pushed to the lts tag. I am not sure. In either case, it is unlikely that I will be able to get you a thread dump.
Please could you attach a thread dump from Jenkins?
... will try to when the server goes down next time.
It might be a little tricky, as this is a production instance, and when it is down there is often a lot of pressure to get it back up and running as soon as possible.
eagle_rainbow Ack, we'll wait until you have a chance to grab the thread dump. Otherwise it will depend on whether this gets fixed by another bugfix.
Just for the record: At the same server, we experienced a deadlock situation today, which may be related to this issue:
Today, we had a very sluggish server (long response latency). On checking, we found several hanging inbound GET requests that took ages (>1,100,000 ms) to complete. A thread dump showed that several threads were blocked on a lock in jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:369), which was indirectly triggered by hudson.plugins.performance.actions.PerformanceProjectAction.doRespondingTimeGraph. Note that the performance plugin was not the only one affected: other GET requests, such as /job/.../test/trend and /job/.../jacoco/graph, were also blocked in jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber.
After a little analysis, we found one of the /job/.../performance/throughputGraph requests still running (state "running"), apparently in an endless loop. It also held the lock of the critical monitor, which blocked all the other requests. To me, the interesting (triggering) block within this thread was:
...
    at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1189)
    at hudson.util.XStream2.unmarshal(XStream2.java:114)
    at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1173)
    at hudson.XmlFile.unmarshal(XmlFile.java:160)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun.reload(WorkflowRun.java:603)
    at hudson.model.Run.<init>(Run.java:325)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun.<init>(WorkflowRun.java:209)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at jenkins.model.lazy.LazyBuildMixIn.loadBuild(LazyBuildMixIn.java:165)
    at jenkins.model.lazy.LazyBuildMixIn$1.create(LazyBuildMixIn.java:142)
Killing the thread did the trick - and the rest started to work again. Afterwards, we had to restart the server - but that was due to another problem, which is unrelated to this one here.
However, the situation today was different than before: Today, we had a significant load average / CPU load during that situation. In the previous situation, load average / CPU load was very normal - also for hours before the blocking event.
Given that the symptoms are different, I am currently not sure whether we just saw the "early stage" of yet-another occurrence of this issue, which we could cure with a courageous thread kill, or whether this was something totally different. For sure, it makes sense to closely look at the list of locks pending, if the issue reappears.
Intermediate feedback: So far, the problem did not reappear for our server. The only thing which we have changed (after the issue I had documented in https://issues.jenkins-ci.org/browse/JENKINS-52362?focusedCommentId=349554&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-349554 ) was the "read timeout" setting in the YAD configuration section from "empty" to 120.
I will update this ticket in case we experience yet another situation where the server stops responding.
eagle_rainbow I'm glad to see you haven't had a recurrence. That sounds like you were held up due to I/O issues – only one operation can try to load a build from disk at a time, and if that takes a while and other operations are depending on that build being loaded they will be blocked until it finishes loading. We did some improvements over the last few releases of workflow-cps and workflow-job to support lazy-loading of the FlowExecution which can make these cases significantly faster (especially where we are using the Performance-Optimized durability mode).
Looking at performance-plugin, it seems that the Performance plugin causes this due to trying to load all builds for each run (which is bound to create a bottleneck for other things trying to access the builds as well).
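To make the blocking described above concrete, here is a deliberately simplified sketch. LazyBuildMap is a hypothetical stand-in, not Jenkins' AbstractLazyLoadRunMap, and the loader callback merely simulates slow disk I/O. It shows how a single monitor around lazy loading lets one slow load stall a request for a build that is already in memory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.function.IntFunction;

/** Simplified stand-in for a lazily loaded build map; NOT Jenkins' AbstractLazyLoadRunMap. */
final class LazyBuildMap {
    private final Map<Integer, String> loaded = new HashMap<>();
    private final IntFunction<String> loader; // hypothetical "load build from disk" callback

    LazyBuildMap(IntFunction<String> loader) {
        this.loader = loader;
    }

    // A single monitor guards every lookup, so one slow load stalls all other callers,
    // even those asking for a build that is already in memory.
    synchronized String getByNumber(int number) {
        String build = loaded.get(number);
        if (build == null) {
            build = loader.apply(number);
            loaded.put(number, build);
        }
        return build;
    }

    /** Returns true if a lookup of a cached build was BLOCKED while another build was loading. */
    static boolean slowLoadBlocksOtherLookups() {
        CountDownLatch loadStarted = new CountDownLatch(1);
        CountDownLatch finishLoad = new CountDownLatch(1);
        LazyBuildMap map = new LazyBuildMap(number -> {
            if (number == 1) { // build #1 simulates slow disk I/O
                loadStarted.countDown();
                try {
                    finishLoad.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            return "build #" + number;
        });
        map.getByNumber(2); // build #2 is now cached
        Thread slow = new Thread(() -> map.getByNumber(1), "slow-load");
        Thread other = new Thread(() -> map.getByNumber(2), "other-request");
        try {
            slow.start();
            loadStarted.await();
            other.start();
            boolean blocked = false;
            for (int i = 0; i < 200 && !blocked; i++) {
                blocked = other.getState() == Thread.State.BLOCKED;
                if (!blocked) {
                    Thread.sleep(10);
                }
            }
            finishLoad.countDown();
            slow.join();
            other.join();
            return blocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            finishLoad.countDown(); // never leave the loader thread waiting
        }
    }

    public static void main(String[] args) {
        System.out.println("cached lookup blocked by slow load: " + slowLoadBlocksOtherLookups());
    }
}
```

In this toy version the second request only needs a cached value, yet it still has to wait for the monitor, which resembles the pile-up of GET request threads reported earlier.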
Now, what I'd love to see is the full stack trace for that "endless loop" – it's possible in some rare cases if the build onLoad ends up invoking itself either directly or indirectly, but suggests a critical bug somewhere (perhaps Performance Plugin, perhaps a combination of plugins).
Also, what is "YAD plugin" short for? "Yet Another Docker plugin"? If so, I'm very curious how that could be related because the timeout there applies to communications with the Docker server.
We did some improvements over the last few releases of workflow-cps and workflow-job
That sounds promising. We'll have a look at this some time later.
Now, what I'd love to see is the full stack trace for that "endless loop".
Before going into detail here, please let me reiterate one thing: It is not proven to me that the "endless loop" case really was an "early version" of the original bug report here. It has happened on the same server at roughly the same time. Having said this, let's have a look at the thread trace.
I have attached an anonymized version of the thread trace I created when the system was in that "endless loop" state (20180919-hangingjenkinsthreads-logs.txt). I suggest starting your analysis by searching for the term "main-voter", which is one of our jobs and, based on my analysis, the job that caused the situation.
Although we enabled quite strict retention on that job, we still have ~250 builds with it. Moreover, expect that each (successful) build will have around 1600 (mostly very small) log files in the build's folder (BTW they give us also a hard time with our backup strategy).
Also, what is "YAD plugin" short for? "Yet Another Docker plugin"?
Yes, correct.
If so, I'm very curious how that could be related because the timeout there applies to communications with the Docker server.
Well, this is yet another guess of ours. Here's the story:
Remember that I had written
Afterwards, we had to restart the server
The reason for that was the thread "jenkins.util.Timer 6". If you look at its stack, you'll see that it's blocked in a ListContainersCmdExec request. That's a call via HTTP REST to the docker server, asking for the list of all containers running on the host (roughly speaking; it's a little more complicated than that). With an additional tool, we found out that it must have been hanging there for hours (so much for the "read timeout": an empty setting there means "infinity"). It was in blocked I/O state, waiting for the result to come back from the docker host.
We don't know exactly what the docker host did (or replied), but such calls usually take only 1-2 seconds to answer; on very busy hosts it may be up to half a minute or so. Our docker host would normally be expected to respond in less than a second. Apparently, the missing response blocked the YAD plugin, and no further containers could be created (which mainly meant that the build queue was blocked, as nearly all our jobs require a new container). We could observe that this also had a negative effect on the management of already running containers/nodes for currently running jobs (containers were hanging strangely). It wouldn't surprise me if that also had an indirect bad influence on many other things...
We also tried to kill that thread (I know that the YAD plugin checks the state of the containers on the host regularly), but as it was in the "blocked I/O" state, the JVM wouldn't allow us to do that. In the end, the only way to get this fixed was to restart Jenkins. We then set the read timeout, hoping that it would no longer wait for eternity.
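The difference a read timeout makes can be seen with plain sockets. The sketch below is generic Java, not code from the YAD plugin, and how the plugin maps its "read timeout" setting onto the underlying connection is an assumption here. ReadTimeoutDemo is a hypothetical name. A local server socket that never answers plays the role of the silent docker host.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Generic socket demo; this is NOT code from the Yet Another Docker plugin. */
final class ReadTimeoutDemo {
    /** Returns true if the blocked read gave up after timeoutMillis. */
    static boolean readTimesOut(int timeoutMillis) {
        // The server socket never calls accept() or writes anything, so it plays the
        // role of a peer that never answers. The OS still completes the TCP handshake.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // a value of 0 means "wait indefinitely"
            try {
                client.getInputStream().read(); // blocks: no data will ever arrive
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a timeout of 0 the same read would block until the peer sends something or the connection is torn down, which matches the hours-long hang described above.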
But again: a big bunch of the story here is guess work, which we put together like a mosaic based on some small pieces which we saw. That's why nobody should take this for "granted knowledge".
We encountered the same issue (running Jenkins ver. 2.60.3). I also attached our thread dump, FYI. Restarting the master node solved the issue (for now).
Our Jenkins is deployed to Kubernetes but we don't have Yet Another Docker plugin installed.
Short update from our case:
- Yesterday we had another case where we almost stumbled into this situation again.
- Situation was that one user had caused many requests to the main page of a job (called "JenkinsJob102" later on) where the complex graphs (jacoco/test coverage/...) are rendered. We again stumbled into the situation that the HTTP request threads were hanging for a long time.
- CPU load was up to 500% (i.e. 5 cores were busy). Note that still some CPU capacity was free.
- I/O load apparently was not the biggest problem (otherwise we would have seen a different setting on "top").
- Jenkins stopped job processing. HTTP response time was in the area of 30s and longer.
- We did not see the error log message "INFO: Running CpsFlowExecutionOwner[...] unresponsive for 5 sec" (or similar) yet, but the execution of Pipeline jobs had ceased, so I would have expected it to appear very soon.
I attached a thread dump for you to this ticket (20181012-statebefore.txt). We detected two culprits:
- Lock 0x00000000d7102070 was the culprit that the GET requests started to queue again (search for "#3231" in the thread dump file). All hanging GET HTTP requests threads were against JenkinsJob102.
We first killed the thread which has "#3231" in its name, as it was the current owner of the lock. CPU shortly dropped, but all the rest of the other threads kicked in. We then manually killed also the rest of the threads, as we were very confident that these requests were leftovers which no user ever would require anymore. That took roughly 15 minutes to be done, as performance of Jenkins was bad.
Once they were gone, CPU load was at around our usual 20%. Yet, the job queue was still not being processed.
- Taking another thread dump snapshot (which I unfortunately lost shortly thereafter), we then detected that the Yet Another Docker plugin (YAD) was waiting for a response from our docker server again. It held a lock on the method "getClient()", so other threads provisioning new slaves could not acquire the lock (nearly all our jobs in the queue require a docker-based slave in one way or another). Having cross-checked with the docker server (which was not expecting to send anything anymore), we then also killed the thread that was waiting for an answer from the docker server that would never come.
With that also provisioning of slaves resumed and the job queue started to reduce.
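Finding the owner of a lock and everyone queued behind it, as done by hand above, can be partly automated. The following sketch indexes jstack-style text by lock address. LockReport is a hypothetical helper and the thread names in the sample are made up. It is only a heuristic: it relies on the "- locked", "- waiting to lock" and "- parking to wait for" lines and ignores everything else in the dump.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: a rough index of lock owners and waiters in jstack-style text. */
final class LockReport {
    private static final Pattern THREAD = Pattern.compile("^\"(.+?)\"");
    private static final Pattern LOCKED = Pattern.compile("- locked <(0x[0-9a-f]+)>");
    private static final Pattern WAITING =
            Pattern.compile("- (?:waiting to lock|parking to wait for)\\s+<(0x[0-9a-f]+)>");

    final Map<String, String> owners = new LinkedHashMap<>();        // lock address -> owning thread
    final Map<String, List<String>> waiters = new LinkedHashMap<>(); // lock address -> waiting threads

    static LockReport parse(String dump) {
        LockReport report = new LockReport();
        String current = null; // name of the thread whose stack is being read
        for (String line : dump.split("\\R")) {
            Matcher header = THREAD.matcher(line);
            if (header.find()) {
                current = header.group(1);
                continue;
            }
            if (current == null) {
                continue; // preamble before the first thread header
            }
            Matcher locked = LOCKED.matcher(line);
            if (locked.find()) {
                report.owners.put(locked.group(1), current);
            }
            Matcher waiting = WAITING.matcher(line);
            if (waiting.find()) {
                report.waiters.computeIfAbsent(waiting.group(1), k -> new ArrayList<>()).add(current);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        String dump = String.join("\n",
                "\"request-A\" #10 daemon prio=5", // hypothetical thread names
                "\tat example.Loader.load(Loader.java:1)",
                "\t- locked <0x00000000d7102070> (a java.lang.Object)",
                "",
                "\"request-B\" #11 daemon prio=5",
                "\t- waiting to lock <0x00000000d7102070> (a java.lang.Object)");
        LockReport report = parse(dump);
        report.waiters.forEach((lock, threads) -> System.out.println(
                lock + " held by " + report.owners.get(lock) + ", waited on by " + threads));
    }
}
```

Sorting the waiters map by list size would put the most contended lock first, which is usually the one worth inspecting before killing threads.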
Having the Jenkins server in good shape again, we dared to try reproducing the situation: One user logged on and opened four browser tabs pointing to the main page of job "JenkinsJob102". He then did a single "browser refresh (F5)" for each of these tabs. CPU load then almost immediately was up to 500% again and we had roughly a dozen of hanging GET request threads (note though that the number of "hanging threads" was much lower than I would have expected them to be based on the attempts the user made - so "some of the requests" must have been completed in virtually no time). Response time was again as bad as we had experienced it before. The stack traces also looked similar.
We killed the HTTP GET request threads again; this time we did not run into a YAD lock problem, though. System was then up well thereafter.
In the builds folder of job "JenkinsJob102", we currently have 316 builds.
The moral of this story to me is that we have two distinct locking issues, which can compound each other.
What I cannot explain properly yet, though, is the question why sometimes the "heavy GET requests" can be answered in virtually no time (obviously they aren't cached...), but as soon as the number of requests to the resource goes up, each execution takes much more CPU capacity (excluding wait time) than before.
After a longer period of not having experienced the problem, we just encountered it again. I could take a jstack snapshot and I found the following interesting thread stack trace:
"Running CpsFlowExecution[Owner[apppname1/appname2/appname3/appname4/134675:appname1/appname2/appname3/appname4 #134675]]" #164285 daemon prio=5 os_prio=0 tid=0x00007f668d046000 nid=0x2b16 waiting on condition [0x00007f65db018000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000d6edcc30> (a org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Segment)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at org.codehaus.groovy.util.LockableObject.lock(LockableObject.java:37)
    at org.codehaus.groovy.util.AbstractConcurrentMapBase$Segment.removeEntry(AbstractConcurrentMapBase.java:173)
    at org.codehaus.groovy.util.ManagedConcurrentMap$Entry.finalizeReference(ManagedConcurrentMap.java:81)
    at org.codehaus.groovy.util.ManagedConcurrentMap$EntryWithValue.finalizeReference(ManagedConcurrentMap.java:115)
    at org.codehaus.groovy.reflection.GroovyClassValuePreJava7$EntryWithValue.finalizeReference(GroovyClassValuePreJava7.java:51)
    at org.codehaus.groovy.util.ReferenceManager$CallBackedManager.removeStallEntries0(ReferenceManager.java:108)
    at org.codehaus.groovy.util.ReferenceManager$CallBackedManager.removeStallEntries(ReferenceManager.java:93)
    at org.codehaus.groovy.util.ReferenceManager$CallBackedManager.afterReferenceCreation(ReferenceManager.java:117)
    at org.codehaus.groovy.util.ReferenceManager$1.afterReferenceCreation(ReferenceManager.java:135)
    at org.codehaus.groovy.util.ManagedReference.<init>(ManagedReference.java:36)
    at org.codehaus.groovy.util.ManagedReference.<init>(ManagedReference.java:40)
    at org.codehaus.groovy.util.ManagedLinkedList$Element.<init>(ManagedLinkedList.java:40)
    at org.codehaus.groovy.util.ManagedLinkedList.add(ManagedLinkedList.java:102)
    at org.codehaus.groovy.reflection.ClassInfo$GlobalClassSet.add(ClassInfo.java:478)
    - locked <0x00000000d6e6aa68> (a org.codehaus.groovy.util.ManagedLinkedList)
    at org.codehaus.groovy.reflection.ClassInfo$1.computeValue(ClassInfo.java:83)
    at org.codehaus.groovy.reflection.ClassInfo$1.computeValue(ClassInfo.java:79)
    at org.codehaus.groovy.reflection.GroovyClassValuePreJava7$EntryWithValue.<init>(GroovyClassValuePreJava7.java:37)
    at org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Segment.createEntry(GroovyClassValuePreJava7.java:64)
    at org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Segment.createEntry(GroovyClassValuePreJava7.java:55)
    at org.codehaus.groovy.util.AbstractConcurrentMap$Segment.put(AbstractConcurrentMap.java:157)
    at org.codehaus.groovy.util.AbstractConcurrentMap$Segment.getOrPut(AbstractConcurrentMap.java:100)
    at org.codehaus.groovy.util.AbstractConcurrentMap.getOrPut(AbstractConcurrentMap.java:38)
    at org.codehaus.groovy.reflection.GroovyClassValuePreJava7.get(GroovyClassValuePreJava7.java:94)
    at org.codehaus.groovy.reflection.ClassInfo.getClassInfo(ClassInfo.java:144)
    at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.getMetaClass(MetaClassRegistryImpl.java:258)
    at org.codehaus.groovy.runtime.InvokerHelper.getMetaClass(InvokerHelper.java:883)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:75)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallStatic(CallSiteArray.java:56)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callStatic(AbstractCallSite.java:194)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callStatic(AbstractCallSite.java:198)
    at dpGetOnboardingCredentials.<clinit>(dpGetOnboardingCredentials.groovy)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.jenkinsci.plugins.workflow.cps.global.UserDefinedGlobalVariable.getValue(UserDefinedGlobalVariable.java:54)
    at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:99)
    at sun.reflect.GeneratedMethodAccessor220.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1218)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1027)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:157)
    at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:133)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:155)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:159)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:129)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:129)
    at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
    at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:57)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
    at sun.reflect.GeneratedMethodAccessor179.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.dispatch(CollectionLiteralBlock.java:55)
    at com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.item(CollectionLiteralBlock.java:45)
    at sun.reflect.GeneratedMethodAccessor188.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
    at com.cloudbees.groovy.cps.Next.step(Next.java:83)
    at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
    at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
    at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:122)
    at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:261)
    at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:19)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:35)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:32)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:32)
    at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:331)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:82)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:243)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:231)
    at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
The interesting part is that I cannot find any job holding the lock either, and this state lasted for more than 5 minutes.
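One possible way to look for the holder of such a lock from inside the JVM is java.lang.management. The sketch below uses a ReentrantLock, for which the JVM can report the owner of a parked thread's lock. Whether Groovy's LockableObject from the trace above records its owner in a way the JVM can see is not verified here, so treat this purely as an illustration. ParkedThreadOwner and the thread names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical demo: ask the JVM who owns the lock a thread is waiting for. */
final class ParkedThreadOwner {
    /** Name of the thread owning the lock that 'waiter' is blocked or parked on, or null. */
    static String ownerOfLockAwaitedBy(Thread waiter) {
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(waiter.getId());
        return info == null ? null : info.getLockOwnerName();
    }

    static String demo() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await(); // keep the lock until the demo has looked at the waiter
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "holder");
        Thread waiter = new Thread(() -> {
            lock.lock(); // parks here while "holder" owns the lock
            lock.unlock();
        }, "waiter");
        try {
            holder.start();
            held.await();
            waiter.start();
            String owner = null;
            for (int i = 0; i < 200 && owner == null; i++) {
                owner = ownerOfLockAwaitedBy(waiter);
                if (owner == null) {
                    Thread.sleep(10);
                }
            }
            release.countDown();
            holder.join();
            waiter.join();
            return owner;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // never leave the holder thread waiting
        }
    }

    public static void main(String[] args) {
        System.out.println("lock awaited by 'waiter' is owned by: " + demo());
    }
}
```

If the owner comes back as null for a parked thread, the lock either has no owner at that instant or is of a kind the JVM does not track, which would be consistent with not finding a holder in the dump.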
I "killed" the thread using the monitoring plugin. Very soon thereafter, the next job ran into the same problem, which reads:
"Executor #-1 for master : executing appname5/appname6 #105" #168783 daemon prio=5 os_prio=0 tid=0x00007f668d137800 nid=0x49df waiting on condition [0x00007f65dd9d0000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000d6ed59e0> (a org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Segment)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at org.codehaus.groovy.util.LockableObject.lock(LockableObject.java:37)
    at org.codehaus.groovy.util.AbstractConcurrentMap$Segment.put(AbstractConcurrentMap.java:104)
    at org.codehaus.groovy.util.AbstractConcurrentMap$Segment.getOrPut(AbstractConcurrentMap.java:100)
    at org.codehaus.groovy.util.AbstractConcurrentMap.getOrPut(AbstractConcurrentMap.java:38)
    at org.codehaus.groovy.reflection.GroovyClassValuePreJava7.get(GroovyClassValuePreJava7.java:94)
    at org.codehaus.groovy.reflection.ClassInfo.getClassInfo(ClassInfo.java:144)
    at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.getMetaClass(MetaClassRegistryImpl.java:258)
    at org.codehaus.groovy.runtime.InvokerHelper.getMetaClass(InvokerHelper.java:883)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:75)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallStatic(CallSiteArray.java:56)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callStatic(AbstractCallSite.java:194)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callStatic(AbstractCallSite.java:198)
    at WorkflowScript.<clinit>(WorkflowScript)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.codehaus.groovy.runtime.InvokerHelper.createScript(InvokerHelper.java:434)
    at groovy.lang.GroovyShell.parse(GroovyShell.java:700)
    at org.jenkinsci.plugins.workflow.cps.CpsGroovyShell.doParse(CpsGroovyShell.java:133)
    at org.jenkinsci.plugins.workflow.cps.CpsGroovyShell.reparse(CpsGroovyShell.java:127)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.parseScript(CpsFlowExecution.java:557)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.start(CpsFlowExecution.java:518)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun.run(WorkflowRun.java:269)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:405)
The jobs are in a very early state. The second job, for example, only has the following lines in the console output:
Started by user [User name censored]
Does this say anything to anyone already?
Downgrading to Jenkins 2.140 helped me; I don't see any issues anymore.
By the way, the issue exists on versions 2.138.3 and 2.151.
The latest LTS version, 2.150.1, also has it.
This is really annoying. We have to restart the whole server a few times a day.
We are experiencing the same, and also have to reboot our server several times per day.
Then sometimes it goes for days without a problem.
I filed this bug - https://issues.jenkins-ci.org/browse/JENKINS-54894
We are going to try creating a new server to resolve this since we also have 2 other servers, configured nearly the same as the one we keep having to reboot. The new one will be identical to the 2 we don't have a problem on.
I took a thread dump when we saw this issue in the log today. We seem to be hitting a deadlock in EC2 provisioning, possibly related to the changes introduced to spawn instances without delay. Have others enabled this feature?
Our symptoms: the Jenkins queue grows (not substantially), and jobs don't seem to execute even though we have available executors. New jobs fail to even launch, and there is nothing in the logs either. For all intents and purposes, we don't get any lightweight executors. We see the following log entry multiple times for jobs that were in progress.
Jan 02, 2019 10:02:36 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0
INFO: Running CpsFlowExecution[Owner[<Omitted>/master #59]] unresponsive for 55 min
hudson.plugins.ec2.AmazonEC2Cloud is blocking 3 threads.
jenkins.util.Timer [#7] thread obtained hudson.plugins.ec2.AmazonEC2Cloud's lock & did not release it. Due to that 3 threads are BLOCKED as shown in the below graph. If threads are BLOCKED for prolonged period, application will become unresponsive. Examine 'jenkins.util.Timer [#7]' stacktrace to see why lock is not released.
jenkins.util.Timer [#7] - priority:5 - threadId:0x00007f21a00f9800 - nativeId:0x1e3ac - nativeId (decimal):123820 - state:WAITING stackTrace: java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000c06ca768> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at hudson.model.Queue._withLock(Queue.java:1438) at hudson.model.Queue.withLock(Queue.java:1301) at jenkins.model.Nodes.updateNode(Nodes.java:193) at jenkins.model.Jenkins.updateNode(Jenkins.java:2095) at hudson.model.Node.save(Node.java:140) at hudson.util.PersistedList.onModified(PersistedList.java:173) at hudson.util.PersistedList.replaceBy(PersistedList.java:85) at hudson.model.Slave.<init>(Slave.java:198) at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:138) at hudson.plugins.ec2.EC2OndemandSlave.<init>(EC2OndemandSlave.java:49) at hudson.plugins.ec2.EC2OndemandSlave.<init>(EC2OndemandSlave.java:42) at hudson.plugins.ec2.SlaveTemplate.newOndemandSlave(SlaveTemplate.java:963) at hudson.plugins.ec2.SlaveTemplate.toSlaves(SlaveTemplate.java:660) at hudson.plugins.ec2.SlaveTemplate.provisionOndemand(SlaveTemplate.java:632) at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:463) at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:587) - locked <0x00000000d5800000> (a hudson.plugins.ec2.AmazonEC2Cloud) at 
hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:602) at hudson.plugins.ec2.NoDelayProvisionerStrategy.apply(NoDelayProvisionerStrategy.java:48) at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320) at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61) at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:809) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
Handling POST /cloud/ec2-us-east-1/provision from <omitted> : <omitted> Stack Trace is: java.lang.Thread.State: BLOCKED (on object monitor) at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:568) - waiting to lock <0x00000000d5800000> (a hudson.plugins.ec2.AmazonEC2Cloud) at hudson.plugins.ec2.EC2Cloud.doProvision(EC2Cloud.java:358) at java.lang.invoke.LambdaForm$DMH/565760380.invokeVirtual_LL_L(LambdaForm$DMH) at java.lang.invoke.LambdaForm$BMH/222162113.reinvoke(LambdaForm$BMH) at java.lang.invoke.LambdaForm$MH/775458576.invoker(LambdaForm$MH) at java.lang.invoke.LambdaForm$MH/2100864823.invokeExact_MT(LambdaForm$MH) at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627) at org.kohsuke.stapler.Function$MethodFunction.invoke(Function.java:396) at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:408) at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:212) at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:145) at org.kohsuke.stapler.MetaClass$11.doDispatch(MetaClass.java:537) at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:58) at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:739) at org.kohsuke.stapler.Stapler.invoke(Stapler.java:870) at org.kohsuke.stapler.MetaClass$4.doDispatch(MetaClass.java:282) at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:58) at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:739) at org.kohsuke.stapler.Stapler.invoke(Stapler.java:870) at org.kohsuke.stapler.Stapler.invoke(Stapler.java:668) at org.kohsuke.stapler.Stapler.service(Stapler.java:238) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154) at 
org.jenkinsci.plugins.ssegateway.Endpoint$SSEListenChannelFilter.doFilter(Endpoint.java:243) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at com.cloudbees.jenkins.support.slowrequest.SlowRequestFilter.doFilter(SlowRequestFilter.java:37) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at io.jenkins.blueocean.auth.jwt.impl.JwtAuthenticationFilter.doFilter(JwtAuthenticationFilter.java:61) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at com.smartcodeltd.jenkinsci.plugin.assetbundler.filters.LessCSS.doFilter(LessCSS.java:47) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at io.jenkins.blueocean.ResourceCacheControl.doFilter(ResourceCacheControl.java:134) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:59) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:239) at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:215) at net.bull.javamelody.PluginMonitoringFilter.doFilter(PluginMonitoringFilter.java:88) at org.jvnet.hudson.plugins.monitoring.HudsonMonitoringFilter.doFilter(HudsonMonitoringFilter.java:114) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at jenkins.metrics.impl.MetricsFilter.doFilter(MetricsFilter.java:125) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at jenkins.telemetry.impl.UserLanguages$AcceptLanguageFilter.doFilter(UserLanguages.java:128) at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151) at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:157) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:64) at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84) at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51) at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87) at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117) at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87) at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125) at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87) at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142) at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87) at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271) at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87) at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:93) at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87) at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249) at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67) at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87) at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:90) at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:171) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:49) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at 
hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:82) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at org.kohsuke.stapler.DiagnosticThreadNameFilter.doFilter(DiagnosticThreadNameFilter.java:30) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) at org.eclipse.jetty.server.Server.handle(Server.java:503) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305) at 
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765) at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683) at java.lang.Thread.run(Thread.java:748)
EC2 alive slaves monitor thread Stack Trace is: java.lang.Thread.State: BLOCKED (on object monitor) at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:748) - waiting to lock <0x00000000d5800000> (a hudson.plugins.ec2.AmazonEC2Cloud) at hudson.plugins.ec2.CloudHelper.getInstance(CloudHelper.java:47) at hudson.plugins.ec2.EC2AbstractSlave.fetchLiveInstanceData(EC2AbstractSlave.java:475) at hudson.plugins.ec2.EC2AbstractSlave.isAlive(EC2AbstractSlave.java:443) at hudson.plugins.ec2.EC2SlaveMonitor.execute(EC2SlaveMonitor.java:43) at hudson.model.AsyncPeriodicWork$1.run(AsyncPeriodicWork.java:101) at java.lang.Thread.run(Thread.java:748)
jenkins.util.Timer [#10] Stack Trace is: java.lang.Thread.State: BLOCKED (on object monitor) at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:748) - waiting to lock <0x00000000d5800000> (a hudson.plugins.ec2.AmazonEC2Cloud) at hudson.plugins.ec2.CloudHelper.getInstance(CloudHelper.java:47) at hudson.plugins.ec2.CloudHelper.getInstanceWithRetry(CloudHelper.java:25) at hudson.plugins.ec2.EC2Computer.getState(EC2Computer.java:127) at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:112) at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:90) at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:48) at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72) at hudson.model.Queue._withLock(Queue.java:1381) at hudson.model.Queue.withLock(Queue.java:1258) at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63) at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72) at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
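A dump like the one above can also be read mechanically. The following is a minimal Python sketch, not part of any Jenkins tooling, that scans stack text for the `- locked`, `- waiting to lock`, and `- parking to wait for` markers seen in the dump and groups threads by lock address. The function name `summarize_locks`, the input shape, and the abbreviated `SAMPLE` stacks are assumptions made for illustration; the thread names and addresses are copied from the dump.

```python
import re

# Markers as they appear in the thread dump quoted above.
LOCKED = re.compile(r"- locked <(0x[0-9a-f]+)>")
WAITING = re.compile(r"- (?:waiting to lock|parking to wait for)\s+<(0x[0-9a-f]+)>")


def summarize_locks(threads):
    """Map each lock address to its holder and to the threads waiting on it.

    `threads` is a dict of thread name -> stack trace text (hypothetical input shape).
    """
    holders, waiters = {}, {}
    for name, stack in threads.items():
        for addr in LOCKED.findall(stack):
            holders[addr] = name
        for addr in WAITING.findall(stack):
            waiters.setdefault(addr, []).append(name)
    return holders, waiters


# Abbreviated excerpts of the stacks above (only the lock-related parts are kept).
SAMPLE = {
    "jenkins.util.Timer [#7]": (
        "- parking to wait for <0x00000000c06ca768> "
        "(a java.util.concurrent.locks.ReentrantLock$NonfairSync) "
        "- locked <0x00000000d5800000> (a hudson.plugins.ec2.AmazonEC2Cloud)"
    ),
    "EC2 alive slaves monitor thread": (
        "- waiting to lock <0x00000000d5800000> (a hudson.plugins.ec2.AmazonEC2Cloud)"
    ),
    "jenkins.util.Timer [#10]": (
        "- waiting to lock <0x00000000d5800000> (a hudson.plugins.ec2.AmazonEC2Cloud)"
    ),
}

if __name__ == "__main__":
    holders, waiters = summarize_locks(SAMPLE)
    for addr, names in waiters.items():
        print(addr, "held by", holders.get(addr, "unknown"), "waited on by", names)
```

On this sample, Timer [#7] holds the AmazonEC2Cloud monitor while itself parked on the Queue lock. The Timer [#10] stack runs inside `Queue.withLock`, which suggests (the dump does not list owned ReentrantLocks, so this is an inference) that each of the two threads is waiting on a lock the other holds.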
daxroc, we had similar issues before, and I fear that this is only loosely related to the issue here. In our case, we are not using the EC2 cloud plugin but the Docker cloud plugin (YAD). The root cause for us was that the plugin did not have a connection timeout configured. If a connection to the cloud manager then fails, the thread waits forever for an answer. For the sake of consistency, however, the thread had acquired a lock, which is then never released... and that is when all the trouble started.
That is why I would suggest you have a look at your timeout values (I don't know whether they are configurable in the case of EC2) and, if applicable, post them here for further cross-checking. If they are too high, you should fix that first.
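To illustrate why the timeout matters, here is a minimal Python sketch of the pattern described above. It is not code from either plugin: a `queue.Queue` stands in for the remote cloud manager, and `cloud_lock` and `fetch_state` are hypothetical names. The point is only that a bounded wait lets the lock be released even when no answer ever arrives.

```python
import queue
import threading

cloud_lock = threading.Lock()  # stands in for the cloud plugin's lock (assumption)


def fetch_state(replies, timeout_s):
    """Wait for a reply while holding the lock, but give up after timeout_s seconds."""
    with cloud_lock:
        try:
            return replies.get(timeout=timeout_s)
        except queue.Empty:
            # No answer in time: return, so the lock is released on exit
            # and other threads are not blocked indefinitely.
            return None


if __name__ == "__main__":
    silent_backend = queue.Queue()  # a backend that never answers
    print(fetch_state(silent_backend, timeout_s=0.1))
    print("lock still held:", cloud_lock.locked())
```

Without the `timeout` argument, `get()` would block forever on a silent backend and every other thread needing `cloud_lock` would queue up behind it, which matches the symptom described in this comment.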
Our timeouts are set quite high; I'll reduce those today. It looks like this is a known issue in the EC2 plugin: JENKINS-53858
Also running into this issue. As others have noted restarting Jenkins resolves the issue short term, but after a couple days uptime the problem resurfaces. Adding the versions of our affected environment.
Jenkins Core: 2.289.1
workflow-aggregator: 2.6
workflow-api: 2.45
workflow-basic-steps: 2.23
workflow-cps: 2.92
workflow-cps-global-lib: 2.19
workflow-durable-task-step: 2.39
workflow-job: 2.41
workflow-multibranch: 2.26
workflow-scm-step: 2.13
workflow-step-api: 2.23
workflow-support: 3.8
pipeline-build-step: 2.13
pipeline-graph-analysis: 1.11
pipeline-input-step: 2.12
pipeline-milestone-step: 1.3.2
pipeline-model-api: 1.8.5
pipeline-model-declarative-agent: N/A
pipeline-model-definition: 1.8.5
pipeline-model-extensions: 1.8.5
pipeline-rest-api: 2.19
pipeline-stage-step: 2.5
pipeline-stage-tags-metadata: 1.8.5
pipeline-stage-view: 2.19
pipeline-utility-steps: 2.8.0
Unfortunately, this issue exists in 2.3.0.3 as well.
We now rely on restarting Jenkins every 4 to 8 hours.
Killing the blocked threads does not help for long: another deadlock quickly takes their place, and this goes on until Jenkins is restarted.
We have also observed that this deadlock will not appear for FREE-STYLE jobs.
This appears only when we have PIPELINES running at around 25 to 40 concurrent JOBS.
We have 72 GB of RAM, a 32-core CPU, and a fast 15K RPM disk, but nothing saves us other than a restart.
It has been 15 days, and we are struggling to keep Jenkins up for even 12 continuous hours to accommodate long-running builds; we keep hitting the deadlock.
This has been extremely painful, and we are now migrating long-running jobs to separate Jenkins instances.
These are the documentation and guides I tried.
- https://www.jenkins.io/blog/2016/11/21/gc-tuning/
- https://www.cloudbees.com/blog/joining-big-leagues-tuning-jenkins-gc-responsiveness-and-stability
- https://www.oracle.com/java/technologies/javase/vmoptions-jsp.html
- https://wiki.jenkins.io/JENKINS/Consideration-for-Large-Scale-Jenkins-Deployment.html
- https://issues.jenkins.io/browse/JENKINS-29664
- https://issues.jenkins.io/browse/JENKINS-59850
- https://issues.jenkins.io/browse/JENKINS-62222
2021-11-30 16:44:06.415+0000 [id=87538] INFO o.j.p.s.s.g.SandboxResolvingClassLoader#lambda$load$2: took 5,736ms to load/not load java.io.Serializable$groovy$lang$com$cloudbees$groovy$cps$env$HTTPS_PROXY from classLoader hudson.PluginManager$UberClassLoader
2021-11-30 16:44:06.413+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/VTY/VTY~vm-provisioning~git/216:SEPG/VTY/VTY~vm-provisioning~git #216]] loading java.io.Serializable$com$cloudbees$groovy$cps$hudson$model$Hudson unresponsive for 13 min
2021-11-30 16:45:27.977+0000 [id=87816] INFO o.j.p.s.s.g.SandboxResolvingClassLoader#lambda$load$2: took 57,825ms to load/not load java/util/groovy/json/env$SEPG_OVERRIDE_NO_PROXY.groovy from classLoader hudson.PluginManager$UberClassLoader
2021-11-30 16:44:59.173+0000 [id=87338] INFO o.j.p.s.s.g.SandboxResolvingClassLoader#lambda$load$2: took 5,805ms to load/not load java/io/Serializable$com$cloudbees$groovy$cps$hudson$model$Hudson.groovy from classLoader hudson.PluginManager$UberClassLoader
2021-11-30 16:45:33.645+0000 [id=87801] INFO o.j.p.s.s.g.SandboxResolvingClassLoader#lambda$load$2: took 5,664ms to load/not load groovy.lang.GroovyObject$java$net$jenkins$env$JOB_NAME from classLoader hudson.PluginManager$UberClassLoader
2021-11-30 16:45:27.980+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/IDSolutions/IDSolutions~eurus-ui~ISOL-883~CI/1:SEPG/IDLSolutions/IDLSolutions~eurus-ui~ISOL-883~CI #1]] unresponsive for 12 min
2021-11-30 16:45:39.405+0000 [id=87338] INFO o.j.p.s.s.g.SandboxResolvingClassLoader#lambda$load$2: took 11,423ms to load/not load groovy.lang.GroovyObject$com$cloudbees$groovy$cps$hudson$model$Hudson from classLoader hudson.PluginManager$UberClassLoader
2021-11-30 16:45:39.405+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/VTY/VTY~ftp~master~CI/1012:SEPG/VTY/VTY~ftp~master~CI #1012]] unresponsive for 13 min
2021-11-30 16:45:39.405+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep 447: checking /tmp/ws on sepg-nix03-DH unresponsive for 13 min
2021-11-30 16:45:39.405+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/sepg/sepg~sample-dotnet-6~git/5:SEPG/sepg/sepg~sample-dotnet-6~git #5]] unresponsive for 14 min
2021-11-30 16:46:02.449+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/sepg/sepg~SEPGrelBasic~git/99:SEPG/sepg/sepg~SEPGrelBasic~git #99]] unresponsive for 19 min
2021-11-30 16:46:02.449+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/VTY/VTY~deploy-helm~git/651:SEPG/VTY/VTY~deploy-helm~git #651]] loading groovy/lang/GroovyObject$com$cloudbees$groovy$cps$hudson$model$Hudson.groovy unresponsive for 15 min
2021-11-30 16:46:02.449+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/VTY/VTY~vm-provisioning~git/216:SEPG/VTY/VTY~vm-provisioning~git #216]] unresponsive for 15 min
2021-11-30 16:46:02.450+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/IDLSolutions/IDLSolutions~eurus-ui~ISOL-883~CI/1:SEPG/IDLSolutions/IDLSolutions~eurus-ui~ISOL-883~CI #1]] unresponsive for 12 min
2021-11-30 16:46:02.450+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/VTY/VTY~ftp~master~CI/1012:SEPG/VTY/VTY~ftp~master~CI #1012]] unresponsive for 14 min
2021-11-30 16:46:02.450+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep 447: checking /tmp/ws on sepg-nix03-DH unresponsive for 14 min
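Log excerpts like the one above are easier to compare across incidents once the numbers are pulled out. Below is a minimal Python sketch, assuming the two message shapes seen in this log ("took N ms to load/not load ..." and "unresponsive for N sec/min"); the function names are hypothetical, other time units are not handled, and the sample lines are copied from the excerpt.

```python
import re

TOOK = re.compile(r"took ([\d,]+)ms to load/not load (\S+) from classLoader")
UNRESPONSIVE = re.compile(r"unresponsive for (\d+) (sec|min)")


def slow_class_loads(lines):
    """Return (milliseconds, name) pairs for class-load timings, slowest first."""
    hits = []
    for line in lines:
        m = TOOK.search(line)
        if m:
            hits.append((int(m.group(1).replace(",", "")), m.group(2)))
    return sorted(hits, reverse=True)


def longest_stall_seconds(lines):
    """Return the largest 'unresponsive for N sec/min' value in seconds (0 if none)."""
    longest = 0
    for line in lines:
        m = UNRESPONSIVE.search(line)
        if m:
            value = int(m.group(1))
            longest = max(longest, value * 60 if m.group(2) == "min" else value)
    return longest


# Three lines copied from the log excerpt above.
SAMPLE_LOG = [
    "2021-11-30 16:44:06.415+0000 [id=87538] INFO o.j.p.s.s.g.SandboxResolvingClassLoader#lambda$load$2: took 5,736ms to load/not load java.io.Serializable$groovy$lang$com$cloudbees$groovy$cps$env$HTTPS_PROXY from classLoader hudson.PluginManager$UberClassLoader",
    "2021-11-30 16:45:27.977+0000 [id=87816] INFO o.j.p.s.s.g.SandboxResolvingClassLoader#lambda$load$2: took 57,825ms to load/not load java/util/groovy/json/env$SEPG_OVERRIDE_NO_PROXY.groovy from classLoader hudson.PluginManager$UberClassLoader",
    "2021-11-30 16:46:02.449+0000 [id=1596] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecutionOwner[SEPG/sepg/sepg~SEPGrelBasic~git/99:SEPG/sepg/sepg~SEPGrelBasic~git #99]] unresponsive for 19 min",
]

if __name__ == "__main__":
    for ms, name in slow_class_loads(SAMPLE_LOG):
        print(ms, name)
    print("longest stall (s):", longest_stall_seconds(SAMPLE_LOG))
```

Sorting by load time makes it easier to see whether the slow lookups cluster around a few unusual class names, as they appear to in the excerpt.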
JENKINS_JAVA_OPTIONS="-Xms1024m \
-Xmx30720m \
-XX:+UseG1GC \
-XX:MaxMetaspaceSize=2048m \
-XX:MaxGCPauseMillis=200 \
-XX:+ParallelRefProcEnabled \
-XX:+AlwaysPreTouch \
-XX:+UseStringDeduplication \
-XX:+DisableExplicitGC \
-XX:+ExplicitGCInvokesConcurrent \
-Djava.awt.headless=true"
We encounter this issue on a daily basis. Each time Jenkins does not continue any job, uses only one CPU core to about 100% and logs the following INFO message for any running job in the Jenkins log:
INFO: Running CpsFlowExecution[Owner[FOLDER/repo/branch/306:FOLDER/repo/branch #306]] unresponsive for 45 min
At the time I copied the log message above, Jenkins had already been in this state for 45 min. After some more waiting, Jenkins seems to be able to break out of this seemingly endless state.
Sometimes this state lasts for only a few minutes; sometimes it lasts for about 1 hour.
I can provide more information if necessary, please just tell me what you need.
Jenkins version 2.334 running with Temurin Java 11.
Installed Plugins:
com/coravy/hudson/plugins/github/github/1.29.3/github-1.29.3.hpi
com/datapipe/jenkins/plugins/hashicorp-vault-plugin/336.v182c0fbaaeb7/hashicorp-vault-plugin-336.v182c0fbaaeb7.hpi
io/jenkins/blueocean/blueocean-bitbucket-pipeline/1.25.2/blueocean-bitbucket-pipeline-1.25.2.hpi
io/jenkins/blueocean/blueocean-commons/1.25.2/blueocean-commons-1.25.2.hpi
io/jenkins/blueocean/blueocean-config/1.25.2/blueocean-config-1.25.2.hpi
io/jenkins/blueocean/blueocean-core-js/1.25.2/blueocean-core-js-1.25.2.hpi
io/jenkins/blueocean/blueocean-dashboard/1.25.2/blueocean-dashboard-1.25.2.hpi
io/jenkins/blueocean/blueocean-events/1.25.2/blueocean-events-1.25.2.hpi
io/jenkins/blueocean/blueocean-git-pipeline/1.25.2/blueocean-git-pipeline-1.25.2.hpi
io/jenkins/blueocean/blueocean-github-pipeline/1.25.2/blueocean-github-pipeline-1.25.2.hpi
io/jenkins/blueocean/blueocean-i18n/1.25.2/blueocean-i18n-1.25.2.hpi
io/jenkins/blueocean/blueocean-jwt/1.25.2/blueocean-jwt-1.25.2.hpi
io/jenkins/blueocean/blueocean-personalization/1.25.2/blueocean-personalization-1.25.2.hpi
io/jenkins/blueocean/blueocean-pipeline-api-impl/1.25.2/blueocean-pipeline-api-impl-1.25.2.hpi
io/jenkins/blueocean/blueocean-pipeline-editor/1.25.2/blueocean-pipeline-editor-1.25.2.hpi
io/jenkins/blueocean/blueocean-pipeline-scm-api/1.25.2/blueocean-pipeline-scm-api-1.25.2.hpi
io/jenkins/blueocean/blueocean-rest-impl/1.25.2/blueocean-rest-impl-1.25.2.hpi
io/jenkins/blueocean/blueocean-rest/1.25.2/blueocean-rest-1.25.2.hpi
io/jenkins/blueocean/blueocean-web/1.25.2/blueocean-web-1.25.2.hpi
io/jenkins/blueocean/blueocean/1.25.2/blueocean-1.25.2.hpi
io/jenkins/blueocean/jenkins-design-language/1.25.2/jenkins-design-language-1.25.2.hpi
io/jenkins/configuration-as-code/1346.ve8cfa_3473c94/configuration-as-code-1346.ve8cfa_3473c94.hpi
io/jenkins/plugins/bootstrap4-api/4.6.0-3/bootstrap4-api-4.6.0-3.hpi
io/jenkins/plugins/bootstrap5-api/5.1.1-1/bootstrap5-api-5.1.1-1.hpi
io/jenkins/plugins/caffeine-api/2.9.2-29.v717aac953ff3/caffeine-api-2.9.2-29.v717aac953ff3.hpi
io/jenkins/plugins/checks-api/1.7.2/checks-api-1.7.2.hpi
io/jenkins/plugins/code-coverage-api/2.0.4/code-coverage-api-2.0.4.hpi
io/jenkins/plugins/configuration-as-code-groovy/1.1/configuration-as-code-groovy-1.1.hpi
io/jenkins/plugins/data-tables-api/1.11.3-1/data-tables-api-1.11.3-1.hpi
io/jenkins/plugins/echarts-api/5.2.1-2/echarts-api-5.2.1-2.hpi
io/jenkins/plugins/font-awesome-api/5.15.4-1/font-awesome-api-5.15.4-1.hpi
io/jenkins/plugins/forensics-api/1.5.0/forensics-api-1.5.0.hpi
io/jenkins/plugins/h2-api/1.4.199/h2-api-1.4.199.hpi
io/jenkins/plugins/javax-activation-api/1.2.0-2/javax-activation-api-1.2.0-2.hpi
io/jenkins/plugins/javax-mail-api/1.6.2-2/javax-mail-api-1.6.2-2.hpi
io/jenkins/plugins/jquery3-api/3.6.0-2/jquery3-api-3.6.0-2.hpi
io/jenkins/plugins/okhttp-api/3.12.12.2/okhttp-api-3.12.12.2.hpi
io/jenkins/plugins/plugin-util-api/2.5.1/plugin-util-api-2.5.1.hpi
io/jenkins/plugins/popper-api/1.16.1-2/popper-api-1.16.1-2.hpi
io/jenkins/plugins/popper2-api/2.10.1-1/popper2-api-2.10.1-1.hpi
io/jenkins/plugins/snakeyaml-api/1.29.1/snakeyaml-api-1.29.1.hpi
org/6wind/jenkins/lockable-resources/2.3/lockable-resources-2.3.hpi
org/csanchez/jenkins/plugins/kubernetes/1.31.3/kubernetes-1.31.3.hpi
org/jenkins-ci/main/maven-plugin/3.17/maven-plugin-3.17.hpi
org/jenkins-ci/modules/sshd/3.0.4/sshd-3.0.4.hpi
org/jenkins-ci/plugins/antisamy-markup-formatter/2.7/antisamy-markup-formatter-2.7.hpi
org/jenkins-ci/plugins/apache-httpcomponents-client-4-api/4.5.13-1.0/apache-httpcomponents-client-4-api-4.5.13-1.0.hpi
org/jenkins-ci/plugins/authentication-tokens/1.4/authentication-tokens-1.4.hpi
org/jenkins-ci/plugins/authorize-project/1.4.0/authorize-project-1.4.0.hpi
org/jenkins-ci/plugins/blueocean-autofavorite/1.2.4/blueocean-autofavorite-1.2.4.hpi
org/jenkins-ci/plugins/blueocean-display-url/2.4.1/blueocean-display-url-2.4.1.hpi
org/jenkins-ci/plugins/bouncycastle-api/2.25/bouncycastle-api-2.25.hpi
org/jenkins-ci/plugins/branch-api/2.7.0/branch-api-2.7.0.hpi
org/jenkins-ci/plugins/cloudbees-bitbucket-branch-source/751.vda_24678a_f781/cloudbees-bitbucket-branch-source-751.vda_24678a_f781.hpi
org/jenkins-ci/plugins/cloudbees-folder/6.17/cloudbees-folder-6.17.hpi
org/jenkins-ci/plugins/conditional-buildstep/1.4.1/conditional-buildstep-1.4.1.hpi
org/jenkins-ci/plugins/config-file-provider/3.9.0/config-file-provider-3.9.0.hpi
org/jenkins-ci/plugins/credentials-binding/1.27.1/credentials-binding-1.27.1.hpi
org/jenkins-ci/plugins/credentials/1074.v60e6c29b_b_44b_/credentials-1074.v60e6c29b_b_44b_.hpi
org/jenkins-ci/plugins/display-url-api/2.3.5/display-url-api-2.3.5.hpi
org/jenkins-ci/plugins/durable-task/493.v195aefbb0ff2/durable-task-493.v195aefbb0ff2.hpi
org/jenkins-ci/plugins/git-client/3.11.0/git-client-3.11.0.hpi
org/jenkins-ci/plugins/git-server/1.10/git-server-1.10.hpi
org/jenkins-ci/plugins/git/4.10.1/git-4.10.1.hpi
org/jenkins-ci/plugins/github-api/1.114.2/github-api-1.114.2.hpi
org/jenkins-ci/plugins/github-branch-source/2.6.0/github-branch-source-2.6.0.hpi
org/jenkins-ci/plugins/gradle/1.38/gradle-1.38.hpi
org/jenkins-ci/plugins/handy-uri-templates-2-api/2.1.8-1.0/handy-uri-templates-2-api-2.1.8-1.0.hpi
org/jenkins-ci/plugins/htmlpublisher/1.25/htmlpublisher-1.25.hpi
org/jenkins-ci/plugins/http_request/1.14/http_request-1.14.hpi
org/jenkins-ci/plugins/jackson2-api/2.13.1-246.va8a9f3eaf46a/jackson2-api-2.13.1-246.va8a9f3eaf46a.hpi
org/jenkins-ci/plugins/javadoc/1.6/javadoc-1.6.hpi
org/jenkins-ci/plugins/job-dsl/1.78.3/job-dsl-1.78.3.hpi
org/jenkins-ci/plugins/jsch/0.1.55.2/jsch-0.1.55.2.hpi
org/jenkins-ci/plugins/junit/1.53/junit-1.53.hpi
org/jenkins-ci/plugins/kubernetes-client-api/5.11.2-182.v0f1cf4c5904e/kubernetes-client-api-5.11.2-182.v0f1cf4c5904e.hpi
org/jenkins-ci/plugins/mailer/408.vd726a_1130320/mailer-408.vd726a_1130320.hpi
org/jenkins-ci/plugins/matrix-auth/3.0.1/matrix-auth-3.0.1.hpi
org/jenkins-ci/plugins/matrix-project/1.19/matrix-project-1.19.hpi
org/jenkins-ci/plugins/metrics/4.1.6.1/metrics-4.1.6.1.hpi
org/jenkins-ci/plugins/nodejs/1.5.1/nodejs-1.5.1.hpi
org/jenkins-ci/plugins/parameterized-trigger/2.43/parameterized-trigger-2.43.hpi
org/jenkins-ci/plugins/pipeline-build-step/2.15/pipeline-build-step-2.15.hpi
org/jenkins-ci/plugins/pipeline-graph-analysis/1.11/pipeline-graph-analysis-1.11.hpi
org/jenkins-ci/plugins/pipeline-input-step/446.vf27b_0b_83500e/pipeline-input-step-446.vf27b_0b_83500e.hpi
org/jenkins-ci/plugins/pipeline-maven/3.10.0/pipeline-maven-3.10.0.hpi
org/jenkins-ci/plugins/pipeline-milestone-step/1.3.2/pipeline-milestone-step-1.3.2.hpi
org/jenkins-ci/plugins/pipeline-stage-step/291.vf0a8a7aeeb50/pipeline-stage-step-291.vf0a8a7aeeb50.hpi
org/jenkins-ci/plugins/pipeline-stage-view/pipeline-rest-api/2.15/pipeline-rest-api-2.15.hpi
org/jenkins-ci/plugins/pipeline-stage-view/pipeline-stage-view/2.10/pipeline-stage-view-2.10.hpi
org/jenkins-ci/plugins/pipeline-utility-steps/2.12.0/pipeline-utility-steps-2.12.0.hpi
org/jenkins-ci/plugins/plain-credentials/1.8/plain-credentials-1.8.hpi
org/jenkins-ci/plugins/prometheus/2.0.10/prometheus-2.0.10.hpi
org/jenkins-ci/plugins/pubsub-light/1.16/pubsub-light-1.16.hpi
org/jenkins-ci/plugins/resource-disposer/0.16/resource-disposer-0.16.hpi
org/jenkins-ci/plugins/run-condition/1.3/run-condition-1.3.hpi
org/jenkins-ci/plugins/scm-api/595.vd5a_df5eb_0e39/scm-api-595.vd5a_df5eb_0e39.hpi
org/jenkins-ci/plugins/script-security/1131.v8b_b_5eda_c328e/script-security-1131.v8b_b_5eda_c328e.hpi
org/jenkins-ci/plugins/simple-theme-plugin/103.va_161d09c38c7/simple-theme-plugin-103.va_161d09c38c7.hpi
org/jenkins-ci/plugins/sonar/2.14/sonar-2.14.hpi
org/jenkins-ci/plugins/sse-gateway/1.24/sse-gateway-1.24.hpi
org/jenkins-ci/plugins/ssh-agent/1.24.1/ssh-agent-1.24.1.hpi
org/jenkins-ci/plugins/ssh-credentials/1.19/ssh-credentials-1.19.hpi
org/jenkins-ci/plugins/ssh-steps/2.0.0/ssh-steps-2.0.0.hpi
org/jenkins-ci/plugins/structs/308.v852b473a2b8c/structs-308.v852b473a2b8c.hpi
org/jenkins-ci/plugins/timestamper/1.17/timestamper-1.17.hpi
org/jenkins-ci/plugins/token-macro/277.v7c8f82a_d66b_3/token-macro-277.v7c8f82a_d66b_3.hpi
org/jenkins-ci/plugins/trilead-api/1.0.13/trilead-api-1.0.13.hpi
org/jenkins-ci/plugins/variant/1.4/variant-1.4.hpi
org/jenkins-ci/plugins/webhook-step/89.vfa4b9e961ebf/webhook-step-89.vfa4b9e961ebf.hpi
org/jenkins-ci/plugins/workflow/workflow-aggregator/2.6/workflow-aggregator-2.6.hpi
org/jenkins-ci/plugins/workflow/workflow-api/1136.v7f5f1759dc16/workflow-api-1136.v7f5f1759dc16.hpi
org/jenkins-ci/plugins/workflow/workflow-basic-steps/2.24/workflow-basic-steps-2.24.hpi
org/jenkins-ci/plugins/workflow/workflow-cps-global-lib/552.vd9cc05b8a2e1/workflow-cps-global-lib-552.vd9cc05b8a2e1.hpi
org/jenkins-ci/plugins/workflow/workflow-cps/2648.va9433432b33c/workflow-cps-2648.va9433432b33c.hpi
org/jenkins-ci/plugins/workflow/workflow-durable-task-step/1121.va_65b_d2701486/workflow-durable-task-step-1121.va_65b_d2701486.hpi
org/jenkins-ci/plugins/workflow/workflow-job/1145.v7f2433caa07f/workflow-job-1145.v7f2433caa07f.hpi
org/jenkins-ci/plugins/workflow/workflow-multibranch/706.vd43c65dec013/workflow-multibranch-706.vd43c65dec013.hpi
org/jenkins-ci/plugins/workflow/workflow-scm-step/2.13/workflow-scm-step-2.13.hpi
org/jenkins-ci/plugins/workflow/workflow-step-api/622.vb_8e7c15b_c95a_/workflow-step-api-622.vb_8e7c15b_c95a_.hpi
org/jenkins-ci/plugins/workflow/workflow-support/813.vb_d7c3d2984a_0/workflow-support-813.vb_d7c3d2984a_0.hpi
org/jenkins-ci/plugins/ws-cleanup/0.40/ws-cleanup-0.40.hpi
org/jenkins-ci/ui/ace-editor/1.1/ace-editor-1.1.hpi
org/jenkins-ci/ui/handlebars/1.1/handlebars-1.1.hpi
org/jenkins-ci/ui/jquery-detached/1.2.1/jquery-detached-1.2.1.hpi
org/jenkins-ci/ui/momentjs/1.1/momentjs-1.1.hpi
org/jenkinsci/plugins/kubernetes-credentials/0.9.0/kubernetes-credentials-0.9.0.hpi
org/jenkinsci/plugins/pipeline-model-api/2.2064.v5eef7d0982b_e/pipeline-model-api-2.2064.v5eef7d0982b_e.hpi
org/jenkinsci/plugins/pipeline-model-definition/2.2064.v5eef7d0982b_e/pipeline-model-definition-2.2064.v5eef7d0982b_e.hpi
org/jenkinsci/plugins/pipeline-model-extensions/2.2064.v5eef7d0982b_e/pipeline-model-extensions-2.2064.v5eef7d0982b_e.hpi
org/jenkinsci/plugins/pipeline-stage-tags-metadata/2.2064.v5eef7d0982b_e/pipeline-stage-tags-metadata-2.2064.v5eef7d0982b_e.hpi
org/jvnet/hudson/plugins/favorite/2.3.1/favorite-2.3.1.hpi
com/oneandone/jenkins/intranet-login-plugin/2.7/intranet-login-plugin-2.7.hpi
Edit: The root cause was the Performance plugin running perfReport on ~240 MB of test data.
We've been experiencing a similar issue for the past 2 days.
A thread dump is attached: ganthore-threads.dump
We're running core 2.344 and the latest Pipeline plugins.
Version: Jenkins 2.355 with the latest plugins.
Problem:
- Some jobs cannot finish even though their work is actually done, and they keep logging the same output.
- The nodes running those jobs are reported as unresponsive for 5 sec, 10 sec, and so on. The counter can reset to 5 sec, which means it is not a deadlock.
- Other jobs on those nodes cannot start or finish.
- In the jstack output for Jenkins, jobs are waiting for a thread, and tee appears in that thread's backtrace. (I forgot to save a copy.)
Reason:
- We use tee in our pipelines.
- Perhaps the command inside tee does not close its file descriptor properly, or the network is unstable (packet loss dropping the EOF), or tee itself has a bug.
- The resulting huge, never-ending output holds the lock for too long.
- The node is then reported as unresponsive, and the unresponsive time resets whenever the lock is acquired.
- Jobs on those nodes cannot start or finish.
After we removed tee from all jobs, the problem disappeared. The cause may be different in other users' reports.
If tee appears in your backtrace, it may be the same problem, and you should try removing it.
Reported by: Aliyun PolarDB Testing team.
We are having this same problem, though I'm not sure whether it's related to the CPS issue in the original subject. Several times a day, jobs just stop at various spots in their pipeline. These are all declarative pipelines, with a Linux controller and a Windows agent. There is plenty of disk space.
Jenkins 2.346.2
Plugins are mostly up to date.
That -Xmx256m looks suspiciously low.
This is from the jenkins.xml:
<executable>/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el7_8.x86_64/jre/bin</executable> --> <arguments>-Xrs -Xmx256m -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle -Djsse.enableSNIExtension=false -jar "%BASE%\jenkins.war" --httpPort=8080</arguments>
Memory:
$ free -h
              total        used        free      shared  buff/cache   available
Mem:           7.6G        3.6G        1.2G        256M        2.9G        3.5G
Swap:          1.0G        345M        678M
Should we bump up the -Xmx to 4g?
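To compare the configured heap with the memory the host reports, the -Xmx value can be pulled out of the arguments string. This is a minimal sketch: the helper name parse_xmx is hypothetical, the arguments string is copied from the jenkins.xml quoted above, and the snippet does not decide whether 4g is the right size for this controller.

```python
import re

# Multipliers for the size suffixes the JVM accepts on -Xmx (k, m, g).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_xmx(arguments):
    """Return the -Xmx value in bytes, or None if the flag is absent."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", arguments)
    if match is None:
        return None
    number, unit = match.groups()
    # No suffix means the value is already in bytes.
    return int(number) * _UNITS.get(unit.lower(), 1)


# Arguments as quoted from jenkins.xml in this comment.
arguments = (
    "-Xrs -Xmx256m -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle "
    '-Djsse.enableSNIExtension=false -jar "%BASE%\\jenkins.war" --httpPort=8080'
)

heap_bytes = parse_xmx(arguments)
print(f"configured max heap: {heap_bytes // 1024**2} MiB")
```

With the values in this comment, the configured heap is a small fraction of the 7.6G the host reports, which is what makes the setting stand out.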
mcascone, not sure whether you have already fixed it, but try updating to OpenJDK 11.
touseef, we're experiencing the same issue. Do you mean that updating the JDK version will help fix it?
Which specific version helps? The latest one?
The issue is still present with Jenkins 2.401.1 running with Java Temurin 17.0.8.
Installed plugins:
com/coravy/hudson/plugins/github/github/1.37.3/github-1.37.3.hpi
io/jenkins/blueocean/blueocean-bitbucket-pipeline/1.27.5/blueocean-bitbucket-pipeline-1.27.5.hpi
io/jenkins/blueocean/blueocean-commons/1.27.5/blueocean-commons-1.27.5.hpi
io/jenkins/blueocean/blueocean-config/1.27.5/blueocean-config-1.27.5.hpi
io/jenkins/blueocean/blueocean-core-js/1.27.5/blueocean-core-js-1.27.5.hpi
io/jenkins/blueocean/blueocean-dashboard/1.27.5/blueocean-dashboard-1.27.5.hpi
io/jenkins/blueocean/blueocean-events/1.27.5/blueocean-events-1.27.5.hpi
io/jenkins/blueocean/blueocean-git-pipeline/1.27.5/blueocean-git-pipeline-1.27.5.hpi
io/jenkins/blueocean/blueocean-github-pipeline/1.27.5/blueocean-github-pipeline-1.27.5.hpi
io/jenkins/blueocean/blueocean-i18n/1.27.5/blueocean-i18n-1.27.5.hpi
io/jenkins/blueocean/blueocean-jwt/1.27.5/blueocean-jwt-1.27.5.hpi
io/jenkins/blueocean/blueocean-personalization/1.27.5/blueocean-personalization-1.27.5.hpi
io/jenkins/blueocean/blueocean-pipeline-api-impl/1.27.5/blueocean-pipeline-api-impl-1.27.5.hpi
io/jenkins/blueocean/blueocean-pipeline-editor/1.27.5/blueocean-pipeline-editor-1.27.5.hpi
io/jenkins/blueocean/blueocean-pipeline-scm-api/1.27.5/blueocean-pipeline-scm-api-1.27.5.hpi
io/jenkins/blueocean/blueocean-rest-impl/1.27.5/blueocean-rest-impl-1.27.5.hpi
io/jenkins/blueocean/blueocean-rest/1.27.5/blueocean-rest-1.27.5.hpi
io/jenkins/blueocean/blueocean-web/1.27.5/blueocean-web-1.27.5.hpi
io/jenkins/blueocean/blueocean/1.27.5/blueocean-1.27.5.hpi
io/jenkins/blueocean/jenkins-design-language/1.27.5/jenkins-design-language-1.27.5.hpi
io/jenkins/configuration-as-code/1670.v564dc8b_982d0/configuration-as-code-1670.v564dc8b_982d0.hpi
io/jenkins/plugins/bootstrap5-api/5.2.2-4/bootstrap5-api-5.2.2-4.hpi
io/jenkins/plugins/caffeine-api/3.1.6-115.vb_8b_b_328e59d8/caffeine-api-3.1.6-115.vb_8b_b_328e59d8.hpi
io/jenkins/plugins/checks-api/1.8.1/checks-api-1.8.1.hpi
io/jenkins/plugins/commons-lang3-api/3.13.0-62.v7d18e55f51e2/commons-lang3-api-3.13.0-62.v7d18e55f51e2.hpi
io/jenkins/plugins/commons-text-api/1.10.0-36.vc008c8fcda_7b_/commons-text-api-1.10.0-36.vc008c8fcda_7b_.hpi
io/jenkins/plugins/configuration-as-code-groovy/1.1/configuration-as-code-groovy-1.1.hpi
io/jenkins/plugins/data-tables-api/1.13.5-1/data-tables-api-1.13.5-1.hpi
io/jenkins/plugins/echarts-api/5.4.0-1/echarts-api-5.4.0-1.hpi
io/jenkins/plugins/font-awesome-api/6.3.0-2/font-awesome-api-6.3.0-2.hpi
io/jenkins/plugins/ionicons-api/56.v1b_1c8c49374e/ionicons-api-56.v1b_1c8c49374e.hpi
io/jenkins/plugins/jakarta-activation-api/2.0.1-3/jakarta-activation-api-2.0.1-3.hpi
io/jenkins/plugins/jakarta-mail-api/2.0.1-3/jakarta-mail-api-2.0.1-3.hpi
io/jenkins/plugins/javax-activation-api/1.2.0-6/javax-activation-api-1.2.0-6.hpi
io/jenkins/plugins/jaxb/2.3.8-1/jaxb-2.3.8-1.hpi
io/jenkins/plugins/jjwt-api/0.11.5-77.v646c772fddb_0/jjwt-api-0.11.5-77.v646c772fddb_0.hpi
io/jenkins/plugins/jquery3-api/3.7.0-1/jquery3-api-3.7.0-1.hpi
io/jenkins/plugins/mina-sshd-api/mina-sshd-api-common/2.10.0-69.v28e3e36d18eb_/mina-sshd-api-common-2.10.0-69.v28e3e36d18eb_.hpi
io/jenkins/plugins/mina-sshd-api/mina-sshd-api-core/2.10.0-69.v28e3e36d18eb_/mina-sshd-api-core-2.10.0-69.v28e3e36d18eb_.hpi
io/jenkins/plugins/okhttp-api/4.11.0-157.v6852a_a_fa_ec11/okhttp-api-4.11.0-157.v6852a_a_fa_ec11.hpi
io/jenkins/plugins/pipeline-graph-view/191.vc6da_9d3eb_70a/pipeline-graph-view-191.vc6da_9d3eb_70a.hpi
io/jenkins/plugins/pipeline-groovy-lib/671.v07c339c842e8/pipeline-groovy-lib-671.v07c339c842e8.hpi
io/jenkins/plugins/plugin-util-api/3.2.1/plugin-util-api-3.2.1.hpi
io/jenkins/plugins/popper2-api/2.11.6-1/popper2-api-2.11.6-1.hpi
io/jenkins/plugins/snakeyaml-api/1.33-95.va_b_a_e3e47b_fa_4/snakeyaml-api-1.33-95.va_b_a_e3e47b_fa_4.hpi
org/6wind/jenkins/lockable-resources/1185.v0c528656ce04/lockable-resources-1185.v0c528656ce04.hpi
org/jenkins-ci/modules/instance-identity/173.va_37c494ec4e5/instance-identity-173.va_37c494ec4e5.hpi
org/jenkins-ci/plugins/antisamy-markup-formatter/159.v25b_c67cd35fb_/antisamy-markup-formatter-159.v25b_c67cd35fb_.hpi
org/jenkins-ci/plugins/apache-httpcomponents-client-4-api/4.5.14-150.v7a_b_9d17134a_5/apache-httpcomponents-client-4-api-4.5.14-150.v7a_b_9d17134a_5.hpi
org/jenkins-ci/plugins/authentication-tokens/1.53.v1c90fd9191a_b_/authentication-tokens-1.53.v1c90fd9191a_b_.hpi
org/jenkins-ci/plugins/authorize-project/1.7.1/authorize-project-1.7.1.hpi
org/jenkins-ci/plugins/badge/1.9.1/badge-1.9.1.hpi
org/jenkins-ci/plugins/blueocean-display-url/2.4.1/blueocean-display-url-2.4.1.hpi
org/jenkins-ci/plugins/bouncycastle-api/2.27/bouncycastle-api-2.27.hpi
org/jenkins-ci/plugins/branch-api/2.1092.vda_3c2a_a_f0c11/branch-api-2.1092.vda_3c2a_a_f0c11.hpi
org/jenkins-ci/plugins/build-with-parameters/76.v9382db_f78962/build-with-parameters-76.v9382db_f78962.hpi
org/jenkins-ci/plugins/cloudbees-bitbucket-branch-source/825.va_6a_dc46a_f97d/cloudbees-bitbucket-branch-source-825.va_6a_dc46a_f97d.hpi
org/jenkins-ci/plugins/cloudbees-folder/6.846.v23698686f0f6/cloudbees-folder-6.846.v23698686f0f6.hpi
org/jenkins-ci/plugins/credentials-binding/604.vb_64480b_c56ca_/credentials-binding-604.vb_64480b_c56ca_.hpi
org/jenkins-ci/plugins/credentials/1271.v54b_1c2c6388a_/credentials-1271.v54b_1c2c6388a_.hpi
org/jenkins-ci/plugins/display-url-api/2.3.8/display-url-api-2.3.8.hpi
org/jenkins-ci/plugins/durable-task/500.v8927d9fd99d8/durable-task-500.v8927d9fd99d8.hpi
org/jenkins-ci/plugins/git-client/4.4.0/git-client-4.4.0.hpi
org/jenkins-ci/plugins/git/5.2.0/git-5.2.0.hpi
org/jenkins-ci/plugins/github-api/1.314-431.v78d72a_3fe4c3/github-api-1.314-431.v78d72a_3fe4c3.hpi
org/jenkins-ci/plugins/github-branch-source/1703.vd5a_2b_29c6cdc/github-branch-source-1703.vd5a_2b_29c6cdc.hpi
org/jenkins-ci/plugins/handy-uri-templates-2-api/2.1.8-22.v77d5b_75e6953/handy-uri-templates-2-api-2.1.8-22.v77d5b_75e6953.hpi
org/jenkins-ci/plugins/htmlpublisher/1.31/htmlpublisher-1.31.hpi
org/jenkins-ci/plugins/http_request/1.18/http_request-1.18.hpi
org/jenkins-ci/plugins/jackson2-api/2.15.2-350.v0c2f3f8fc595/jackson2-api-2.15.2-350.v0c2f3f8fc595.hpi
org/jenkins-ci/plugins/job-dsl/1.84/job-dsl-1.84.hpi
org/jenkins-ci/plugins/junit/1202.v79a_986785076/junit-1202.v79a_986785076.hpi
org/jenkins-ci/plugins/mailer/463.vedf8358e006b_/mailer-463.vedf8358e006b_.hpi
org/jenkins-ci/plugins/matrix-auth/3.1.10/matrix-auth-3.1.10.hpi
org/jenkins-ci/plugins/matrix-project/789.v57a_725b_63c79/matrix-project-789.v57a_725b_63c79.hpi
org/jenkins-ci/plugins/metrics/4.2.18-442.v02e107157925/metrics-4.2.18-442.v02e107157925.hpi
org/jenkins-ci/plugins/parameterized-trigger/2.46/parameterized-trigger-2.46.hpi
org/jenkins-ci/plugins/pipeline-build-step/491.v1fec530da_858/pipeline-build-step-491.v1fec530da_858.hpi
org/jenkins-ci/plugins/pipeline-graph-analysis/202.va_d268e64deb_3/pipeline-graph-analysis-202.va_d268e64deb_3.hpi
org/jenkins-ci/plugins/pipeline-input-step/468.va_5db_051498a_4/pipeline-input-step-468.va_5db_051498a_4.hpi
org/jenkins-ci/plugins/pipeline-milestone-step/101.vd572fef9d926/pipeline-milestone-step-101.vd572fef9d926.hpi
org/jenkins-ci/plugins/pipeline-stage-step/305.ve96d0205c1c6/pipeline-stage-step-305.ve96d0205c1c6.hpi
org/jenkins-ci/plugins/pipeline-stage-view/pipeline-rest-api/2.33/pipeline-rest-api-2.33.hpi
org/jenkins-ci/plugins/pipeline-stage-view/pipeline-stage-view/2.33/pipeline-stage-view-2.33.hpi
org/jenkins-ci/plugins/pipeline-utility-steps/2.16.0/pipeline-utility-steps-2.16.0.hpi
org/jenkins-ci/plugins/plain-credentials/143.v1b_df8b_d3b_e48/plain-credentials-143.v1b_df8b_d3b_e48.hpi
org/jenkins-ci/plugins/pubsub-light/1.16/pubsub-light-1.16.hpi
org/jenkins-ci/plugins/resource-disposer/0.21/resource-disposer-0.21.hpi
org/jenkins-ci/plugins/scm-api/676.v886669a_199a_a_/scm-api-676.v886669a_199a_a_.hpi
org/jenkins-ci/plugins/script-security/1251.vfe552ed55f8d/script-security-1251.vfe552ed55f8d.hpi
org/jenkins-ci/plugins/sse-gateway/1.24/sse-gateway-1.24.hpi
org/jenkins-ci/plugins/ssh-credentials/305.v8f4381501156/ssh-credentials-305.v8f4381501156.hpi
org/jenkins-ci/plugins/ssh-steps/2.0.65.vd26b_5b_9b_de4d/ssh-steps-2.0.65.vd26b_5b_9b_de4d.hpi
org/jenkins-ci/plugins/stashNotifier/1.28/stashNotifier-1.28.hpi
org/jenkins-ci/plugins/structs/324.va_f5d6774f3a_d/structs-324.va_f5d6774f3a_d.hpi
org/jenkins-ci/plugins/swarm/3.40/swarm-3.40.hpi
org/jenkins-ci/plugins/timestamper/1.26/timestamper-1.26.hpi
org/jenkins-ci/plugins/token-macro/383.v36161104b_002/token-macro-383.v36161104b_002.hpi
org/jenkins-ci/plugins/trilead-api/2.84.v72119de229b_7/trilead-api-2.84.v72119de229b_7.hpi
org/jenkins-ci/plugins/variant/59.vf075fe829ccb/variant-59.vf075fe829ccb.hpi
org/jenkins-ci/plugins/webhook-step/173.vfa_b_93560b_977/webhook-step-173.vfa_b_93560b_977.hpi
org/jenkins-ci/plugins/workflow/workflow-aggregator/596.v8c21c963d92d/workflow-aggregator-596.v8c21c963d92d.hpi
org/jenkins-ci/plugins/workflow/workflow-api/1215.v2b_ee3e1b_dd39/workflow-api-1215.v2b_ee3e1b_dd39.hpi
org/jenkins-ci/plugins/workflow/workflow-basic-steps/1042.ve7b_140c4a_e0c/workflow-basic-steps-1042.ve7b_140c4a_e0c.hpi
org/jenkins-ci/plugins/workflow/workflow-cps/3731.ve4b_5b_857b_a_d3/workflow-cps-3731.ve4b_5b_857b_a_d3.hpi
org/jenkins-ci/plugins/workflow/workflow-durable-task-step/1247.v7f9dfea_b_4fd0/workflow-durable-task-step-1247.v7f9dfea_b_4fd0.hpi
org/jenkins-ci/plugins/workflow/workflow-job/1308.v58d48a_763b_31/workflow-job-1308.v58d48a_763b_31.hpi
org/jenkins-ci/plugins/workflow/workflow-multibranch/746.v05814d19c001/workflow-multibranch-746.v05814d19c001.hpi
org/jenkins-ci/plugins/workflow/workflow-scm-step/415.v434365564324/workflow-scm-step-415.v434365564324.hpi
org/jenkins-ci/plugins/workflow/workflow-step-api/639.v6eca_cd8c04a_a_/workflow-step-api-639.v6eca_cd8c04a_a_.hpi
org/jenkins-ci/plugins/workflow/workflow-support/839.v35e2736cfd5c/workflow-support-839.v35e2736cfd5c.hpi
org/jenkins-ci/plugins/ws-cleanup/0.45/ws-cleanup-0.45.hpi
org/jenkinsci/plugins/pipeline-model-api/2.2144.v077a_d1928a_40/pipeline-model-api-2.2144.v077a_d1928a_40.hpi
org/jenkinsci/plugins/pipeline-model-definition/2.2144.v077a_d1928a_40/pipeline-model-definition-2.2144.v077a_d1928a_40.hpi
org/jenkinsci/plugins/pipeline-model-extensions/2.2144.v077a_d1928a_40/pipeline-model-extensions-2.2144.v077a_d1928a_40.hpi
org/jenkinsci/plugins/pipeline-stage-tags-metadata/2.2144.v077a_d1928a_40/pipeline-stage-tags-metadata-2.2144.v077a_d1928a_40.hpi
org/jvnet/hudson/plugins/favorite/2.3.1/favorite-2.3.1.hpi
org/jvnet/hudson/plugins/thinBackup/1.18/thinBackup-1.18.hpi
When we saw those log messages (as in the summary of this ticket), we also had an issue with the call to the build history of an agent; see ticket JENKINS-72138. The blocked threads may be related to this.
Same here: several different pipelines appear in the log with the "unresponsive for..." message.
Dec 21 15:12:42 <host> jenkins[2263991]: 2023-12-21 13:12:42.225+0000 [id=1525] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecution[Owner[<jenkins-pipeline-name> #36]] unresponsive for 1 day 4 hr
Dec 21 15:12:42 <host> jenkins[2263991]: 2023-12-21 13:12:42.227+0000 [id=1525] INFO o.j.p.w.s.concurrent.Timeout#lambda$ping$0: Running CpsFlowExecution[Owner[<jenkins-pipeline-name> #34]] unresponsive for 3 days 5 hr
Even though I aborted some of them manually, they still keep appearing in the log.
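To see at a glance which builds are still being reported and for how long, the "unresponsive for" messages can be tallied per build. This is a minimal sketch, not an official tool: the function names are hypothetical, the pattern only covers the two message styles quoted in this ticket, and the unit table is an assumption based on the spans seen here (sec, hr, day, days) plus ms and min.

```python
import re

# Covers both styles quoted in this ticket:
#   Running CpsFlowExecutionOwner[project/263:project #263] unresponsive for 5 sec
#   Running CpsFlowExecution[Owner[<name> #36]] unresponsive for 1 day 4 hr
RECORD = re.compile(
    r"Running (?P<owner>CpsFlowExecution.*?#\d+\]+) "
    r"unresponsive for (?P<duration>\d+ \w+(?: \d+ \w+)*)"
)

# Assumed unit spellings; extend this table if your log shows others.
UNIT_SECONDS = {"ms": 0.001, "sec": 1, "min": 60, "hr": 3600, "day": 86400, "days": 86400}


def duration_seconds(text):
    """Convert a span such as '1 day 4 hr' to seconds, ignoring unknown units."""
    return sum(
        int(amount) * UNIT_SECONDS.get(unit, 0)
        for amount, unit in re.findall(r"(\d+) (\w+)", text)
    )


def longest_unresponsive(lines):
    """Map each build to the longest unresponsive time reported for it, in seconds."""
    worst = {}
    for line in lines:
        # finditer also copes with several records flattened onto one line.
        for match in RECORD.finditer(line):
            owner = match.group("owner")
            seconds = duration_seconds(match.group("duration"))
            worst[owner] = max(worst.get(owner, 0), seconds)
    return worst


# Abridged from lines quoted in this ticket.
sample = [
    "INFO: Running CpsFlowExecutionOwner[project/263:project #263] unresponsive for 5 sec",
    "Timeout#lambda$ping$0: Running CpsFlowExecution[Owner[<jenkins-pipeline-name> #36]] unresponsive for 1 day 4 hr",
    "Timeout#lambda$ping$0: Running CpsFlowExecution[Owner[<jenkins-pipeline-name> #34]] unresponsive for 3 days 5 hr",
]

for owner, seconds in sorted(longest_unresponsive(sample).items(), key=lambda kv: -kv[1]):
    print(f"{seconds:>8} s  {owner}")
```

In practice the list would come from the syslog or journal output rather than a hard-coded sample. Builds whose reported time keeps growing after an abort are the ones worth checking in a thread dump.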
I started seeing it recently, and NOT after a core or plugin upgrade.
Jenkins version: 2.429
Java version: openjdk 11.0.21
Jenkins is running on a GCP e2-highmem-4 instance (4 vCPUs, 32 GB RAM) with "-Xms2G -Xmx16G -XX:GCTimeRatio=14 -XX:SoftRefLRUPolicyMSPerMB=50 -XX:+UseG1GC"
markwaite, any idea what the root cause could be, or how to troubleshoot this issue?
This issue has been reported by many users (first in 2018).
benipeled asked:
any idea what the root cause could be, or how to troubleshoot this issue?
No idea from me and no suggestions on how to troubleshoot the issue.
Is it possible that the error message looks different with Jenkins 2.479.2 and Java 21?
2024-12-10 08:19:47.008+0000 [id=127127] WARNING o.j.p.w.cps.CpsFlowExecution#blocksRestart: Not blocking restart due to problem checking running steps in CpsFlowExecution[Owner[SOME_JOB_NAME #232]]
java.util.concurrent.TimeoutException: Waited 1 seconds (plus 66615 nanoseconds delay) for SettableFuture@7b535cf5[status=PENDING]
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:534)
    at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:119)
    at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.blocksRestart(CpsFlowExecution.java:1028)
    at PluginClassLoader for workflow-job//org.jenkinsci.plugins.workflow.job.WorkflowRun$2.blocksRestart(WorkflowRun.java:407)
    at PluginClassLoader for workflow-job//org.jenkinsci.plugins.workflow.job.WorkflowRun$2.displayCell(WorkflowRun.java:410)
    at hudson.model.Executor.isDisplayCell(Executor.java:685)
    at hudson.model.Computer.getDisplayExecutors(Computer.java:1016)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at org.apache.commons.jexl.util.PropertyExecutor.execute(PropertyExecutor.java:125)
    at org.apache.commons.jexl.util.introspection.UberspectImpl$VelGetterImpl.invoke(UberspectImpl.java:314)
    at org.apache.commons.jexl.parser.ASTArrayAccess.evaluateExpr(ASTArrayAccess.java:185)
    at org.apache.commons.jexl.parser.ASTIdentifier.execute(ASTIdentifier.java:75)
    at org.apache.commons.jexl.parser.ASTReference.execute(ASTReference.java:83)
    at org.apache.commons.jexl.parser.ASTReference.value(ASTReference.java:57)
    at org.apache.commons.jexl.parser.ASTReferenceExpression.value(ASTReferenceExpression.java:51)
    at org.apache.commons.jexl.ExpressionImpl.evaluate(ExpressionImpl.java:80)
    at hudson.ExpressionFactory2$JexlExpression.evaluate(ExpressionFactory2.java:76)
    at org.apache.commons.jelly.tags.core.CoreTagLibrary$3.run(CoreTagLibrary.java:134)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:102)
    at org.apache.commons.jelly.TagSupport.invokeBody(TagSupport.java:161)
    at org.apache.commons.jelly.tags.core.ForEachTag.doTag(ForEachTag.java:150)
    at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:271)
    at org.apache.commons.jelly.TagSupport.invokeBody(TagSupport.java:161)
    at org.apache.commons.jelly.tags.core.OtherwiseTag.doTag(OtherwiseTag.java:41)
    at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:271)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:102)
    at org.apache.commons.jelly.TagSupport.invokeBody(TagSupport.java:161)
    at org.apache.commons.jelly.tags.core.ChooseTag.doTag(ChooseTag.java:38)
    at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:271)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:102)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
    at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:121)
    at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
    at org.kohsuke.stapler.jelly.JellyViewScript.run(JellyViewScript.java:98)
    at org.kohsuke.stapler.jelly.IncludeTag.doTag(IncludeTag.java:174)
    at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:271)
    at org.kohsuke.stapler.jelly.CallTagLibScript$1.run(CallTagLibScript.java:100)
    at org.apache.commons.jelly.tags.define.InvokeBodyTag.doTag(InvokeBodyTag.java:91)
    at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:271)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.kohsuke.stapler.jelly.CallTagLibScript$1.run(CallTagLibScript.java:100)
    at org.apache.commons.jelly.tags.define.InvokeBodyTag.doTag(InvokeBodyTag.java:91)
    at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:271)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
    at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:121)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.apache.commons.jelly.tags.core.CoreTagLibrary$1.run(CoreTagLibrary.java:98)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
    at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:121)
    at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
    at org.kohsuke.stapler.jelly.JellyViewScript.run(JellyViewScript.java:98)
    at org.kohsuke.stapler.jelly.DefaultScriptInvoker.invokeScript(DefaultScriptInvoker.java:67)
    at org.kohsuke.stapler.jelly.DefaultScriptInvoker.invokeScript(DefaultScriptInvoker.java:55)
    at org.kohsuke.stapler.jelly.ScriptInvoker.execute(ScriptInvoker.java:62)
    at org.kohsuke.stapler.jelly.ScriptInvoker.execute(ScriptInvoker.java:42)
    at org.kohsuke.stapler.Facet$1.dispatch(Facet.java:230)
    at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:800)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:938)
    at org.kohsuke.stapler.MetaClass$5.doDispatch(MetaClass.java:369)
    at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:61)
    at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:800)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:938)
    at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:871)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:938)
    at org.kohsuke.stapler.MetaClass$5.doDispatch(MetaClass.java:369)
    at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:61)
    at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:800)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:938)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:721)
    at org.kohsuke.stapler.Stapler.service(Stapler.java:253)
    at Jenkins Main ClassLoader//jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHolder.handle(ServletHolder.java:765)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1668)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:204)
    at io.jenkins.servlet.FilterChainWrapper$2.doFilter(FilterChainWrapper.java:53)
    at PluginClassLoader for sse-gateway//org.jenkinsci.plugins.ssegateway.Endpoint$SSEListenChannelFilter.doFilter(Endpoint.java:248)
    at io.jenkins.servlet.FilterWrapper$1.doFilter(FilterWrapper.java:42)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:201)
    at io.jenkins.servlet.FilterChainWrapper$2.doFilter(FilterChainWrapper.java:53)
    at PluginClassLoader for blueocean-web//io.jenkins.blueocean.ResourceCacheControl.doFilter(ResourceCacheControl.java:134)
    at io.jenkins.servlet.FilterWrapper$1.doFilter(FilterWrapper.java:42)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:201)
    at io.jenkins.servlet.FilterChainWrapper$2.doFilter(FilterChainWrapper.java:53)
    at PluginClassLoader for blueocean-jwt//io.jenkins.blueocean.auth.jwt.impl.JwtAuthenticationFilter.doFilter(JwtAuthenticationFilter.java:60)
    at io.jenkins.servlet.FilterWrapper$1.doFilter(FilterWrapper.java:42)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:201)
    at io.jenkins.servlet.FilterChainWrapper$2.doFilter(FilterChainWrapper.java:53)
    at PluginClassLoader for metrics//jenkins.metrics.impl.MetricsFilter.doFilter(MetricsFilter.java:125)
    at io.jenkins.servlet.FilterWrapper$1.doFilter(FilterWrapper.java:42)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:201)
    at jenkins.util.HttpServletFilter$1.doFilter(HttpServletFilter.java:77)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:201)
    at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:207)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1638)
    at jenkins.ErrorAttributeFilter.doFilter(ErrorAttributeFilter.java:29)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1638)
    at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:154)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1638)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:94)
    at jenkins.security.AcegiSecurityExceptionFilter.doFilter(AcegiSecurityExceptionFilter.java:52)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:99)
    at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:54)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:99)
    at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:126)
    at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:120)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:99)
    at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:100)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:99)
    at org.springframework.security.web.authentication.rememberme.RememberMeAuthenticationFilter.doFilter(RememberMeAuthenticationFilter.java:110)
    at org.springframework.security.web.authentication.rememberme.RememberMeAuthenticationFilter.doFilter(RememberMeAuthenticationFilter.java:101)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:99)
    at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:227)
    at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:221)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:99)
    at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:98)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:99)
    at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:117)
    at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)
    at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:63)
    at hudson.security.ChainedServletFilter2$1.doFilter(ChainedServletFilter2.java:99)
    at hudson.security.ChainedServletFilter2.doFilter(ChainedServletFilter2.java:111)
    at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:173)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1638)
    at org.kohsuke.stapler.UncaughtExceptionFilter.doFilter(UncaughtExceptionFilter.java:26)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1638)
    at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:86)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1638)
    at org.kohsuke.stapler.DiagnosticThreadNameFilter.doFilter(DiagnosticThreadNameFilter.java:31)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1638)
    at jenkins.security.SuspiciousRequestFilter.doFilter(SuspiciousRequestFilter.java:38)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1638)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler.doHandle(ServletHandler.java:526)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ScopedHandler.handle(ScopedHandler.java:127)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.security.SecurityHandler.handle(SecurityHandler.java:574)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.HandlerWrapper.handle(HandlerWrapper.java:124)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ScopedHandler.nextHandle(ScopedHandler.java:197)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.SessionHandler.doHandle(SessionHandler.java:609)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ScopedHandler.nextHandle(ScopedHandler.java:195)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ContextHandler.doHandle(ContextHandler.java:1035)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ScopedHandler.nextScope(ScopedHandler.java:164)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.servlet.ServletHandler.doScope(ServletHandler.java:483)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ScopedHandler.nextScope(ScopedHandler.java:162)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.SessionHandler.doScope(SessionHandler.java:586)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ScopedHandler.nextScope(ScopedHandler.java:162)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ContextHandler.doScope(ContextHandler.java:956)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ScopedHandler.handle(ScopedHandler.java:125)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ContextHandler.handle(ContextHandler.java:1694)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1576)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.HttpChannel.dispatch(HttpChannel.java:738)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.HttpChannel.handle(HttpChannel.java:511)
    at Jenkins Main ClassLoader//org.eclipse.jetty.ee9.nested.ContextHandler$CoreContextHandler$CoreToNestedHandler.handle(ContextHandler.java:2862)
    at Jenkins Main ClassLoader//org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1060)
    at Jenkins Main ClassLoader//org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:597)
    at Jenkins Main ClassLoader//org.eclipse.jetty.server.Server.handle(Server.java:181)
    at Jenkins Main ClassLoader//org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:661)
    at Jenkins Main ClassLoader//org.eclipse.jetty.server.internal.HttpConnection.onFillable(HttpConnection.java:406)
    at Jenkins Main ClassLoader//org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)
    at Jenkins Main ClassLoader//org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)
    at Jenkins Main ClassLoader//org.eclipse.jetty.io.ssl.SslConnection$SslEndPoint.onFillable(SslConnection.java:574)
    at Jenkins Main ClassLoader//org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:390)
    at Jenkins Main ClassLoader//org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:150)
    at Jenkins Main ClassLoader//org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)
    at Jenkins Main ClassLoader//org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
    at Jenkins Main ClassLoader//org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:478)
    at Jenkins Main ClassLoader//org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:441)
    at Jenkins Main ClassLoader//org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:293)
    at Jenkins Main ClassLoader//org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
    at Jenkins Main ClassLoader//org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
    at Jenkins Main ClassLoader//org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)
    at Jenkins Main ClassLoader//org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)
    at Jenkins Main ClassLoader//org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)
    at java.base/java.lang.Thread.run(Thread.java:1583)
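The top of this trace shows blocksRestart waiting on a future that is still PENDING after 1 second, then logging the WARNING and carrying on. The pattern can be illustrated with a small Python analogy. This is not the plugin's code: the function below is a hypothetical stand-in, and representing the running steps as a list of booleans is an assumption made only for the example.

```python
from concurrent.futures import Future
from concurrent.futures import TimeoutError as FutureTimeout


def blocks_restart(running_steps, timeout_seconds=1.0):
    """Ask a possibly busy worker for its running steps, but never wait long.

    running_steps is a Future that resolves to a list of booleans, one per
    running step, where True means that step wants to block a restart
    (a hypothetical representation chosen for this sketch).
    """
    try:
        steps = running_steps.result(timeout=timeout_seconds)
    except FutureTimeout:
        # Analogous to the WARNING above: the answer did not arrive in time,
        # so give up and do not block the restart.
        return False
    return any(steps)


# A future that nobody completes, like the PENDING SettableFuture in the trace.
stuck = Future()
print(blocks_restart(stuck, timeout_seconds=0.05))

# A future that is already resolved answers immediately.
ready = Future()
ready.set_result([False, True])
print(blocks_restart(ready))
```

Read this way, the WARNING would be another symptom of an execution that is not answering requests, the same condition the "unresponsive for" messages report.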
We're having the same issue. Jenkins will run fine for a while, then lock up and we see the same errors in the logs. We're using Jenkins 2.127, workflow-cps 2.51:
INFO: Running CpsFlowExecutionOwner[git/org/master/146:git/org/master #146] unresponsive for 5 sec
Jul 05, 2018 3:28:36 PM org.jenkinsci.plugins.workflow.support.concurrent.Timeout lambda$ping$0