Type: Bug
Resolution: Fixed
Priority: Major
Environment: Jenkins 2.46
Debian GNU/Linux 7.8 / Docker 1.13.1
Docker workflow plugin 1.10
Hi,
I've recently upgraded the Docker workflow plugin from 1.8 to 1.10. In 1.8 my pipeline worked perfectly well; it uses 2 external containers and 1 that the actions are run inside.
In 1.10 I get the following error on the container launched with the .inside { } method:
$ docker stop --time=1 6b4b512400884e660dc4cd4eda6e9b3d7c358317f08a1c46399b5253ec7e1b02
$ docker rm -f 6b4b512400884e660dc4cd4eda6e9b3d7c358317f08a1c46399b5253ec7e1b02
ERROR: Timeout after 10 seconds
And the job fails with
java.io.IOException: Failed to rm container '6b4b512400884e660dc4cd4eda6e9b3d7c358317f08a1c46399b5253ec7e1b02'.
	at org.jenkinsci.plugins.docker.workflow.client.DockerClient.rm(DockerClient.java:158)
	at org.jenkinsci.plugins.docker.workflow.client.DockerClient.stop(DockerClient.java:145)
	at org.jenkinsci.plugins.docker.workflow.WithContainerStep.destroy(WithContainerStep.java:107)
	at org.jenkinsci.plugins.docker.workflow.WithContainerStep.access$400(WithContainerStep.java:74)
	at org.jenkinsci.plugins.docker.workflow.WithContainerStep$Callback.finished(WithContainerStep.java:302)
	at org.jenkinsci.plugins.workflow.steps.BodyExecutionCallback$TailCall.onSuccess(BodyExecutionCallback.java:114)
	at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution$SuccessAdapter.receive(CpsBodyExecution.java:362)
	at com.cloudbees.groovy.cps.Outcome.resumeFrom(Outcome.java:73)
	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:146)
	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:33)
	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:30)
	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:30)
	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:165)
	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:328)
	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:80)
	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:240)
	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:228)
	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
This seems to be related with the
https://github.com/jenkinsci/docker-workflow-plugin/pull/65
I tried to downgrade manually to 1.8 and it works well again (as it does not specify the --time=1 option anymore).
Is there a way to disable this option?
Thanks,
Attachments: image-2017-06-15-10-44-46-058.png (21 kB, Joshua Noble)
is duplicated by:
- JENKINS-44767 Error when fingerprint docker: Cannot retrieve .Id (Resolved)
- JENKINS-42667 Job fails due to timeout when destroying docker container (Closed)
is related to:
- JENKINS-37719 Build cannot be interrupted if `docker stop` hangs (Resolved)
- JENKINS-57136 Allow users to customize docker timeouts (Open)
relates to:
- JENKINS-46969 Docker container closes prematurely (Resolved)
[JENKINS-42322] Docker rm/stop/... commands killed by the timeout, failing builds
This looks to be the result of the fix for https://issues.jenkins-ci.org/browse/JENKINS-42322.
While I understand the need to have some timeout, 10 seconds is short.
JENKINS-42322 mentions:
(All Docker commands in the plugin that are expected to take a long time, because they might contact a registry, are run from sh steps in Groovy code so they are durable and cleanly interruptible. Docker commands which we do not need to wait for, like docker exec to run processes in the container, are just launched and the process handle discarded. DockerClient with its blocking join call is only used for commands which under normal conditions should be close to instantaneous: docker run -d after checking for local availability of the image, docker inspect, docker stop, etc. But clearly this assumption is not completely reliable.)
The suggested workaround to use sh is far from ideal when you are used to the lovely integration offered by the docker-workflow-plugin.
Could something like this be possible?
docker.image('image').inside("-u fred").timeout(60) {
...
}
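In the meantime, here is a rough sketch of the sh-based workaround mentioned above, in scripted pipeline form. The image name and exec'd command are only placeholders; the point is simply that stop/rm become ordinary durable sh steps with no hard-coded client-side timeout:
node {
    // Start the container ourselves instead of using .inside {}; cleanup then runs
    // as a plain sh step that can take as long as the Docker daemon needs.
    def cid = sh(script: 'docker run -d -t -u fred --entrypoint cat my/image:latest',
                 returnStdout: true).trim()
    try {
        sh "docker exec ${cid} make test"   // placeholder for the real build commands
    } finally {
        sh "docker rm -f ${cid} || true"    // no 10-second cutoff here
    }
}
It loses the nice workspace mounting that inside() does for you, but it keeps builds from failing on cleanup alone.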
Suffering from the same issue. The worst thing about it is that it's hard to reproduce. What I observe is that whenever the build machine is under high load, the Docker plugin most probably times out during the `docker rm -f ...` step.
I still don't understand why `docker rm -f ...` times out. Under the hood it should just be a SIGKILL for the corresponding process, but I'm not sure.
Was anyone able to come up with a workaround, or does anyone have an idea how to fix this?
Unfortunately, we are also affected by this issue and the only workaround is to downgrade.
A timeout for stop probably wouldn't be an issue, but the problem is that it also affects removal of the container, which may take a long time if the container layer is big (e.g. when a lot of packages were installed during the build) and/or the storage is not fast enough.
Also seeing this; downgrading is complicated by other plugins requiring at least v1.9 of this plugin. Trying a longer timeout: https://github.com/jenkinsci/docker-workflow-plugin/compare/master...hughsaunders:remove_timeout
Seems like a good first step, hughsaunders. The only future enhancement would be to make the timeout controllable somehow.
Hi,
I have similar issue after upgrading docker-workflow-plugin from 1.9.1 to 1.10:
[Stream 0] $ docker stop --time=1 ca34a5e714f2cc1049afd3bea0a53356ef1ee60d69a29b0648502f93f3796058
[Stream 0] ERROR: Timeout after 10 seconds
[Stream 0] [Pipeline] [Stream 0] // withDockerContainer
[Pipeline] [Stream 0] }
[Pipeline] [Stream 0] // dir
[Pipeline] [Stream 0] }
[Pipeline] [Stream 0] // dir
[Pipeline] [Stream 0] }
[Stream 0] Failed in branch Stream 0
And it works fine after downgrading to 1.9.1.
At least make the timeout parameter customizable.
Thank you.
Basically I'm seeing the exact same symptom as described by Yuriy. I downgraded to 1.9.1 and so far I've not seen this issue.
Is there any workaround (like a setting in the pipeline)?
I'm seeing the same timeout issue, running 1.10. I noticed that even though the error says:
ERROR: Timeout after 10 seconds
The command is actually only allowing for 1 second:
$ docker stop --time=1 ...
That 1 second time was added here:
https://github.com/jenkinsci/docker-workflow-plugin/commit/0fed7a702751f743eb3603092e1731f8979930a5
jglick Was that, perhaps, supposed to be 10 seconds?
No, --time=1 is intentional, and unrelated to this issue which works around JENKINS-37719 (otherwise it would be much worse: an indefinite hang in Jenkins). At root is some problem in the Docker daemon I think.
Same issue here: a long-running build with a lot of produced artifacts times out in the `docker rm` command. Downgrading to 1.9.1 works around the problem.
Same here. We're running a heavy build inside a Docker container. The build is successful; however, the step fails due to the docker kill timeout. I assume the kill succeeds after some time, as I don't see any stuck containers when I check `docker ps` on the node. It would be great if we could configure the timeout.
Does this issue also apply to docker run?
00:57:02.191 $ docker run -t -d -u 12347:12347 -u jenkins -e USER=jenkins -e LOGNAME=jenkins -v /home/jenkins/.p4tickets:/home/jenkins/.p4tickets:ro -w /jenkins/workspace/reviews/ensite -v /jenkins/workspace/reviews/ensite:/jenkins/workspace/reviews/ensite:rw -v /jenkins/workspace/reviews/ensite.tmp:/jenkins/workspace/reviews/ensite.tmp:rw -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** --entrypoint cat docker.csdt.sjm.com/build_servers/ensite/rel/velocity_6.0
00:57:12.197 ERROR: Timeout after 10 seconds
00:57:12.450 [Pipeline] // withDockerContainer
Here is the exception:
java.io.IOException: Failed to run image 'docker.csdt.sjm.com/build_servers/ensite/rel/velocity_6.0'. Error:
at org.jenkinsci.plugins.docker.workflow.client.DockerClient.run(DockerClient.java:128)
at org.jenkinsci.plugins.docker.workflow.WithContainerStep$Execution.start(WithContainerStep.java:179)
at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:184)
at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:126)
at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:108)
at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:48)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:18)
at org.jenkinsci.plugins.docker.workflow.Docker$Image.inside(jar:file:/var/lib/jenkins/plugins/docker-workflow/WEB-INF/lib/docker-workflow.jar!/org/jenkinsci/plugins/docker/workflow/Docker.groovy:128)
at org.jenkinsci.plugins.docker.workflow.Docker.node(jar:file:/var/lib/jenkins/plugins/docker-workflow/WEB-INF/lib/docker-workflow.jar!/org/jenkinsci/plugins/docker/workflow/Docker.groovy:63)
at org.jenkinsci.plugins.docker.workflow.Docker$Image.inside(jar:file:/var/lib/jenkins/plugins/docker-workflow/WEB-INF/lib/docker-workflow.jar!/org/jenkinsci/plugins/docker/workflow/Docker.groovy:116)
at Script1._worflow_steps(Script1.groovy:680)
at __cps.transform__(Native Method)
at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:57)
at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
at sun.reflect.GeneratedMethodAccessor403.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
at com.cloudbees.groovy.cps.impl.ClosureBlock.eval(ClosureBlock.java:46)
at com.cloudbees.groovy.cps.Next.step(Next.java:74)
at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:154)
at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:165)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:330)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:82)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:242)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:230)
at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
This part of the pipeline build was sitting in the queue for about 45 minutes. I'm not sure if being in the queue would make it count as a "long running build" though ...
10 seconds is far too low. I've opened a PR here: https://github.com/jenkinsci/docker-workflow-plugin/pull/100
I couldn't wait for this to be merged so I've built a SNAPSHOT version of it locally and I'm currently testing it on my Jenkins 2 cluster. I've attached the built plugin here in case others want to use it. It's based on commit 0b5078d985f30dcb0beb81fd2f3c3145973d6cb8 of my fork. https://github.com/acejam/docker-workflow-plugin/commit/0b5078d985f30dcb0beb81fd2f3c3145973d6cb8
https://github.com/jenkinsci/docker-workflow-plugin/files/1005811/docker-workflow.hpi.zip
Is anyone working on it? I tried different workarounds with retry {} and try/catch, but I always hit different unacceptable side effects.
It seems to depend on the size of your Docker image whether this occurs or not.
Our regular build images (< ~600 MB, compiler + cmake aboard) have never run into timeouts so far.
However, now that we also use an image to build our documentation (1.8 GB; compiler, nodejs, calibre, Java, X11 and so on aboard), this seems to occur now and then.
REPOSITORY TAG IMAGE ID CREATED SIZE
arangodb/documentation-builder latest 653714cd7afe 25 hours ago 1.848 GB
debianjessie/build latest ab58cd344649 5 weeks ago 642.9 MB
builder-centos-7.3.1611-maven-3.5-jdk-8 2.0.0 04a9b05d9427 20 hours ago 781 MB
This is mine and it's > 600 MB, but I need Maven, a JDK and some additional packages, so I'm not sure I can go lower.
It would also be nice to provide the container id somehow, so that we can manually try to stop/rm the container if something happens. And I see this pretty often (like every 2nd/3rd build, on a loaded server of course).
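One hedged way to get hold of the container id from inside the body, so a stuck container can be cleaned up by hand if the plugin's own stop/rm later times out: by default Docker sets the container hostname to the short container id, so as long as you don't override the hostname or use host networking via the inside() arguments, a sketch like this should work (image name taken from the listing above, purely as an example):
docker.image('builder-centos-7.3.1611-maven-3.5-jdk-8:2.0.0').inside {
    // hostname defaults to the 12-character short container id
    def shortId = sh(script: 'hostname', returnStdout: true).trim()
    echo "Running in container ${shortId}"   // record it so you can 'docker rm -f' it manually later
    // ... build steps ...
}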
In my experience this relates to two important items:
- Image design
- Current resource utilization of the machine
First, let's talk about Docker image design. There are plenty of Docker images out there that do not handle SIGTERM or SIGINT signals properly. This means that if you have a container from such an image running in your terminal and hit Ctrl+C, it won't exit immediately. It also means that calling "docker stop" sometimes takes a while, because the container won't handle the signal from Docker and thus cannot exit gracefully; Docker must forcefully kill the container instead.
Machine resource utilization also plays a large factor here. For example, if I had a Jenkins build agent with 4 executor slots, only one of those slots may be used for my current pipeline. However at any given time, the 3 remaining slots could very well be busy chugging away running other pipelines and/or Docker images. CPU load is now higher and more memory is being used. As a result, the Docker daemon will sometimes take a few seconds just to process the simple "docker stop" command that Jenkins is sending. Anyone can easily test this theory on their local machine by running a few Docker images that actually "do stuff". Try compiling 3 different apps in 3 different images all at the same time and see what happens. You don't even need Jenkins installed.
Regardless of the two reasons above, the important thing to note here is that these "issues" are all due to the Docker daemon, and don't really have any bearing on Jenkins itself. In other words, this Jenkins plugin is simply wrapping the Docker CLI. Because of this, we must bump the timeout to a value higher than 10 seconds, like I've done in my PR here: https://github.com/jenkinsci/docker-workflow-plugin/pull/100
The unfortunate situation many are facing right now is that your entire build may truly be passing, but it ends up failing simply because Docker isn't given a chance to clean up after itself, because Jenkins cuts it off after only 10 seconds. I've been running a snapshot version of this plugin with my fix from the linked PR above for 3 weeks now, and I've experienced ZERO timeouts thus far.
You can find the snapshot version of my fix here: https://github.com/jenkinsci/docker-workflow-plugin/files/1005811/docker-workflow.hpi.zip
Hi,
my image in question is built from:
and it's reachable via the Docker library under the name I've posted above.
But yes, Joshua is right, typically a `make -j 8` would be running on the same host, so the system is definitely busy.
Greetings,
I tried the snapshot (org.jenkins-ci.plugins:docker-workflow:1.12-SNAPSHOT) but my container still gets killed after 10 seconds.
Also, what I don't understand: I'm running docker run from inside a shell executor, so why is this container killed by the Docker workflow plugin?
R/Daniel
Any hints on whether this will be fixed soon? I cannot fully utilize the plugin with this behavior.
tubesenf Are you sure it's installed correctly? Make sure you see this on the Installed tab of the Update Center/Plugin Manager page:
pgeorgiev Try the snapshot. I don't know why this hasn't been merged yet, but my pull request has been open for a month now. It sounds like the Jenkins organization needs more maintainers. (Hint hint jglick, this is my attempt at volunteering)
Code changed in jenkins
User: Jesse Glick
Path:
src/main/java/org/jenkinsci/plugins/docker/workflow/WithContainerStep.java
src/main/java/org/jenkinsci/plugins/docker/workflow/client/DockerClient.java
src/test/java/org/jenkinsci/plugins/docker/workflow/DockerTestUtil.java
src/test/java/org/jenkinsci/plugins/docker/workflow/WithContainerStepTest.java
http://jenkins-ci.org/commit/docker-workflow-plugin/8c18b064335c3aafff770aa19f2e40c117c910c5
Log:
[FIXED JENKINS-42322] Increased default timeout for docker client operations from 10s to 3m, and making it configurable.
-Dorg.jenkinsci.plugins.docker.workflow.client.DockerClient.CLIENT_TIMEOUT=250 (anything around 300 or more will probably not work, and anyway if your Docker daemon is that broken the plugin would need to be largely redesigned to accommodate your use case)
jglick: Could that option be added to individual agents or is it something that needs to be set on the master?
It would be nice to have this configurable from within the pipeline DSL, on many installations it's not so easy to add command-line parameters to the Jenkins process.
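For installations where adding a -D flag to the controller's JVM is awkward, the script console might be a stop-gap. This is only a sketch: it assumes the plugin reads the timeout from a writable public static field named CLIENT_TIMEOUT on DockerClient (matching the system property name quoted above); if the field is private or final in your plugin version, this will not work and the -D flag on the controller JVM remains the only option.
// Manage Jenkins -> Script Console (takes effect until the next restart).
import org.jenkinsci.plugins.docker.workflow.client.DockerClient

println "current timeout: ${DockerClient.CLIENT_TIMEOUT}s"   // assumed field name, see note above
DockerClient.CLIENT_TIMEOUT = 240
println "new timeout: ${DockerClient.CLIENT_TIMEOUT}s"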
We just noticed this on an older version of the plugin, after our jobs started to time out after one second, which is a very aggressive default setting in the first place:
docker.withRegistry("${registry_url}", "${docker_creds_id}") {
    def img = docker.image('foo/bar:latest')
    img.pull()
    img.inside('-u root') {
        sh """
            echo foo  # this works, as well as anything returning in less than 1 second
            sleep 10  # this times out after a second and gets killed
        """
    }
}
The console output shows this:
- the container is executed in the background with -d:
$ docker run -t -d -u root -w /var/lib/jenkins-slave/workspace/ [...]
- then it gets killed after a short time:
+ sleep 10
$ docker stop --time=1 eed9124cc21d7c78fe6629475e073e41ec02fa63c531be2a12c4e5d16579b36d
$ docker rm -f eed9124cc21d7c78fe6629475e073e41ec02fa63c531be2a12c4e5d16579b36d
...
hudson.AbortException: script returned exit code -1
alien: again, if you are seeing any timeout applicable to the body then something else is very wrong. Probably you should stop using inside.
The issue still appears for me with 1.14.
I could now reproduce the issue by creating some load on the machine with the "stress" cli tool (extracted from stress_1.0.4-2_amd64.deb)
$ while true; do ./stress --cpu 4 --i25 --vm 4 --vm-bytes 7G; done
I ran a simple hello-world pipeline job, and the issue appears.
Plus, I cannot find a place to add "-Dorg.jenkinsci.plugins.docker.workflow.client.DockerClient.CLIENT_TIMEOUT=250" in Manage Jenkins; any idea where I have to add that?
Sorry, I made a mistake, the right command is:
$ while true; do ./stress --cpu 4 --io 25 --vm 4 --vm-bytes 7G; done
You can then monitor the response time of the docker daemon with:
$ while true; do time docker ps 2&>1 ; sleep 1; done
real 0m0.128s
user 0m0.006s
sys 0m0.006s
real 0m0.068s
user 0m0.007s
sys 0m0.005s
real 0m0.039s
user 0m0.009s
sys 0m0.003s
real 0m1.794s
user 0m0.009s
sys 0m0.025s
real 0m1.326s
user 0m0.006s
sys 0m0.063s
^C
real 0m1.325s
user 0m0.003s
sys 0m0.044s
This one is better at measuring the response time:
$ while true; do { time docker ps >/dev/null; } 2>&1 | grep real; sleep 2; done
real 0m0.097s
real 0m0.121s
real 0m0.112s
real 0m0.116s
real 0m0.276s
real 0m7.591s
real 0m7.747s
This is my Jenkins in a docker-compose file that works; the JAVA_OPTS environment variable has to be modified to pass the parameter:
jenkins:
image: myregistry.com/jenkins:2.83:latest
command: --prefix="/jenkins"
user: root
ports:
- "8081:8080"
- "8050:50000"
volumes:
- jenkins-dv:/var/jenkins_home
environment:
- JAVA_OPTS=-Dorg.jenkinsci.plugins.docker.workflow.client.DockerClient.CLIENT_TIMEOUT=240
I'm commenting to report that we began facing similar errors last week, after we changed several pipeline scripts, which increased the frequency with which we launch Docker containers.
We're using Jenkins 2.107.2 and docker-workflow-plugin 1.17.
I increased the plugin timeout to 250s and decreased the number of executors in our slaves to no avail.
In despair, I patched the plugin to make it ignore docker-rm errors. Using the patched plugin we were able to finish all of our weekend builds successfully.
I know this is not a decent solution. I feared that our builds could be delayed waiting for the stuck containers, but I didn't notice this or any other side effect. So, for now, we'll use this patched plugin.
I don't yet understand what makes the docker rm command take so long. Do you have any hints to share?
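A pipeline-level variant of the same idea, for anyone who cannot run a patched plugin: wrap the inside call and swallow only the "Failed to rm container" failure quoted in this thread. This is just a sketch; runInImage is a hypothetical helper, not plugin API, and it cannot tell a cleanup timeout apart from any other late failure carrying the same message.
// Hypothetical helper: tolerate the container-removal timeout instead of failing the build.
def runInImage(String image, Closure body) {
    try {
        docker.image(image).inside {
            body()
        }
    } catch (e) {
        // Rethrow anything that is not the cleanup failure reported in this issue.
        if (!(e.message?.contains('Failed to rm container'))) {
            throw e
        }
        echo "Ignoring docker rm timeout: ${e.message}"
    }
}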
gnustavo Which version of Docker itself are you using, and do you see the timeout error, or just the "Failed to rm container" error?
jglick: thanks for the hint, this seems to help. We now have one timeout with 280 seconds... You say around 300 and more will probably not work. Has this been verified? What kind of error would you expect if we set a value above 300?
Joshua Noble's comment keeps me wondering: why does this plugin choose to use the `cat` program as the first container process? I did some tests and found that `cat` is not killable with SIGTERM, while `bash` is killed perfectly fine. Also, `docker stop` on cat took 10 seconds, which means it was forcefully killed, while the same `docker stop` on bash happens immediately. Could it be related to this issue?
cat with empty stdin is just a way to hang. Could use e.g. sleep 999999 if that binary is just as ubiquitous.
jglick, yes! `cat` does its job, but it is not killable with `SIGTERM`. In the end, everything is killable, but it needs more effort from Docker.
The problem is that, for some reason, 180s is not enough for `docker rm` in our infra. I just don't have an idea why; the problem looks illogical to me. So just a random idea: maybe it could be improved by using something different from `cat`?
Or another idea: maybe it is possible to do `docker rm` outside the job, asynchronously?
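The asynchronous-removal idea can only be approximated today for containers you start yourself (the plugin's own .inside {} cleanup cannot be deferred from the pipeline). A hedged sketch, with the usual caveat that backgrounded processes in sh steps must have their output redirected or the step may not return:
// Fire-and-forget removal: detach docker rm so the build does not wait for the daemon.
// cid is the id you obtained when running the container yourself, as in the manual workaround earlier in this thread.
sh "nohup docker rm -f ${cid} >/dev/null 2>&1 &"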
I have tried to redefine the CLIENT_TIMEOUT parameter via the JAVA_OPTS environment variable in the Docker agent definition in Jenkins DSL, without success:
agent {
    docker {
        image "**/**:1.0"
        args "-u jenkins -v /var/run/docker.sock:/var/run/docker.sock --security-opt seccomp=unconfined --name \"${BUILD_TAG}\" -e JAVA_OPTS=\"-Dorg.jenkinsci.plugins.docker.workflow.client.DockerClient.CLIENT_TIMEOUT=240\""
        reuseNode true
        alwaysPull true
        label node_label
    }
}
I checked the environment variables inside the Docker container during the pipeline execution, and the JAVA_OPTS variable is correctly set:
~$ docker inspect ******* | jq '.[] | .Config.Env'
...
"JAVA_OPTS=-Dorg.jenkinsci.plugins.docker.workflow.client.DockerClient.CLIENT_TIMEOUT=240",
Please, I need your help. The pipeline is working fine, except for a container with a size of 1.5 GB:
[Pipeline] }
$ docker stop --time=1 c4bf7a64a476881420dd7d03aba9d2596ce799f309a10ca66e5cdd739b6afe9c
$ docker rm -f c4bf7a64a476881420dd7d03aba9d2596ce799f309a10ca66e5cdd739b6afe9c
ERROR: Timeout after 180 seconds
[Pipeline] // withDockerContainer
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
java.io.IOException: Failed to rm container 'c4bf7a64a476881420dd7d03aba9d2596ce799f309a10ca66e5cdd739b6afe9c'.
	at org.jenkinsci.plugins.docker.workflow.client.DockerClient.rm(DockerClient.java:201)
	at org.jenkinsci.plugins.docker.workflow.client.DockerClient.stop(DockerClient.java:187)
	at org.jenkinsci.plugins.docker.workflow.WithContainerStep.destroy(WithContainerStep.java:109)
	at org.jenkinsci.plugins.docker.workflow.WithContainerStep.access$400(WithContainerStep.java:76)
	...
Jenkins version: 2.277.4 – Docker pipeline: 1.26
Please, any help would be appreciated. Thanks in advance.
jglick Please, could you help me? I don't know how to configure this CLIENT_TIMEOUT property. Is it possible in the Jenkins Pipeline DSL? If not, I understand that it is possible to define it in the configuration of the Jenkins nodes, isn't it? You can check my doubts two messages above. Thanks in advance.
I see exactly the same timeout, most often with the "docker rm" step, but also with "docker stop" and even with Docker starting up. This effectively makes using Docker on a loaded system impractical, as you don't know if you will hit the problem.
Please can you provide a way to remove/control the timeout.