- Bug
- Resolution: Unresolved
- Major
- None
- Jenkins version 2.289.2, Docker workflow plugin version 1.26, Docker daemon version 20.10.6
After our recent upgrade to a newer Jenkins core and docker-workflow plugin, some of our pipelines have started failing to kill the Docker container after stage execution finishes, and this happens randomly.
Below is the stack trace from the pipeline log:
[2021-11-09T10:03:36.519Z] java.io.IOException: Failed to kill container '140fb1f0a48df20a1960b9b80a5dce298f997edf4c3cba5bdebaf225af4c5542'.
[2021-11-09T10:03:36.541Z] at org.jenkinsci.plugins.docker.workflow.client.DockerClient.stop(DockerClient.java:184)
[2021-11-09T10:03:36.541Z] at org.jenkinsci.plugins.docker.workflow.WithContainerStep.destroy(WithContainerStep.java:109)
[2021-11-09T10:03:36.542Z] at org.jenkinsci.plugins.docker.workflow.WithContainerStep.access$400(WithContainerStep.java:76)
[2021-11-09T10:03:36.543Z] at org.jenkinsci.plugins.docker.workflow.WithContainerStep$Callback.finished(WithContainerStep.java:391)
[2021-11-09T10:03:36.543Z] at org.jenkinsci.plugins.workflow.steps.BodyExecutionCallback$TailCall.onSuccess(BodyExecutionCallback.java:118)
[2021-11-09T10:03:36.544Z] at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution$SuccessAdapter.receive(CpsBodyExecution.java:377)
[2021-11-09T10:03:36.545Z] at com.cloudbees.groovy.cps.Outcome.resumeFrom(Outcome.java:73)
[2021-11-09T10:03:36.545Z] at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:166)
[2021-11-09T10:03:36.546Z] at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
[2021-11-09T10:03:36.547Z] at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
[2021-11-09T10:03:36.547Z] at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
[2021-11-09T10:03:36.548Z] at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
[2021-11-09T10:03:36.549Z] at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
[2021-11-09T10:03:36.549Z] at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
[2021-11-09T10:03:36.550Z] at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:185)
[2021-11-09T10:03:36.551Z] at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:400)
[2021-11-09T10:03:36.552Z] at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:96)
[2021-11-09T10:03:36.552Z] at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:312)
[2021-11-09T10:03:36.553Z] at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:276)
[2021-11-09T10:03:36.554Z] at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
[2021-11-09T10:03:36.554Z] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[2021-11-09T10:03:36.555Z] at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
[2021-11-09T10:03:36.555Z] at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
[2021-11-09T10:03:36.556Z] at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
[2021-11-09T10:03:36.557Z] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[2021-11-09T10:03:36.557Z] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[2021-11-09T10:03:36.558Z] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2021-11-09T10:03:36.559Z] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2021-11-09T10:03:36.560Z] at java.lang.Thread.run(Thread.java:748)
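As far as I can tell from the stack trace, DockerClient.stop() runs something equivalent to the sketch below: a docker stop with a 1-second grace period followed by a force remove, and the IOException above appears to be raised when that docker stop exits non-zero. This is only my reconstruction (the class and helper names are mine, not the plugin's), but the hard-coded 1-second grace matches the daemon log further down.

import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Reconstruction only: what the plugin's stop sequence appears to be equivalent to,
// not a copy of its code. The key detail is the hard-coded --time=1.
public class StopSequenceSketch {

    static int run(String... cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        if (!p.waitFor(180, TimeUnit.SECONDS)) {
            p.destroyForcibly();
            throw new IOException("Timed out running: " + String.join(" ", cmd));
        }
        return p.exitValue();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        String containerId = args.length > 0 ? args[0]
                : "140fb1f0a48df20a1960b9b80a5dce298f997edf4c3cba5bdebaf225af4c5542";
        // 1) Ask dockerd to SIGTERM the container, giving it only 1 second before SIGKILL.
        if (run("docker", "stop", "--time=1", containerId) != 0) {
            // This is exactly the message we see in the pipeline log above.
            throw new IOException(String.format("Failed to kill container '%s'.", containerId));
        }
        // 2) Force-remove the stopped container.
        if (run("docker", "rm", "-f", containerId) != 0) {
            throw new IOException(String.format("Failed to remove container '%s'.", containerId));
        }
    }
}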
At the same time, here is what I found in the Docker daemon log:
Nov 9 11:58:05 dockerd: time="2021-11-09T11:58:05.687304030+02:00" level=info msg="Container 140fb1f0a48df20a1960b9b80a5dce298f997edf4c3cba5bdebaf225af4c5542 failed to exit within 1 seconds of signal 15 - using the force"
Nov 9 11:58:07 dockerd: time="2021-11-09T11:58:07.811355435+02:00" level=info msg="ignoring event" container=140fb1f0a48df20a1960b9b80a5dce298f997edf4c3cba5bdebaf225af4c5542 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 9 11:58:07 containerd: time="2021-11-09T11:58:07.811596661+02:00" level=info msg="shim disconnected" id=140fb1f0a48df20a1960b9b80a5dce298f997edf4c3cba5bdebaf225af4c5542
Any idea why we run into this issue when stopping the container? Is there maybe a way to increase the timeout (the daemon log says the container failed to exit within just 1 second)?
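To make the second question concrete, by "increase the timeout" I mean giving the container a longer grace period before the force kill, along the lines of the sketch below. The 30-second value is just an example, and as far as I know this is not currently configurable in the plugin.

import java.io.IOException;

// Sketch only: the same stop call, but with a longer (example) grace period of 30 seconds
// instead of the 1 second seen in the daemon log above.
public class StopWithLongerGrace {
    public static void main(String[] args) throws IOException, InterruptedException {
        String containerId = args.length > 0 ? args[0]
                : "140fb1f0a48df20a1960b9b80a5dce298f997edf4c3cba5bdebaf225af4c5542";
        int status = new ProcessBuilder("docker", "stop", "--time=30", containerId)
                .inheritIO()
                .start()
                .waitFor();
        System.out.println("docker stop exited with status " + status);
    }
}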