
Stopping the Docker container takes too long (default docker stop timeout = 10s)

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major
    • docker-workflow-plugin
    • None
    • CloudBees Docker Pipeline 1.7
      Jenkins ver. 2.7

      Please use docker stop --time 0 when stopping the container for inside {}.
      The inside construct generates the following CLI calls:

      00:00:55.724 $ docker run -t -d -u 1000:1000 -w ... -v ...:rw -v ...@tmp:rw -e * tag cat
      00:00:56.357 + <shell command>
      00:00:56.369 $ docker stop 
      00:01:06.542 $ docker rm -f 
      

      But docker stop takes 10 seconds to stop this cat command, which is too long.

      docker stop --help
      
      Usage:  docker stop [OPTIONS] CONTAINER [CONTAINER...]
      
      Stop one or more running containers
      
      Options:
            --help       Print usage
        -t, --time int   Seconds to wait for stop before killing it (default 10)
      

      Experiment 1

      $ time docker run -t -d --name test ubuntu cat
      af6c228448ace32c66dba70efa2cc6189bc35dccdb2b950544e2a2e807a1d955
      
      real    0m0.289s
      user    0m0.016s
      sys     0m0.004s
      
      $ time docker stop test
      test
      
      real    0m10.206s
      user    0m0.008s
      sys     0m0.008s
      
      $ time docker rm -f test
      test
      
      real    0m0.021s
      user    0m0.008s
      sys     0m0.008s
      

      Experiment 2

      $ time docker run -t -d --name test ubuntu cat
      5d02979c9928e47455981f162707590122b5b124ec8748740273846201704a17
      
      real    0m0.324s
      user    0m0.008s
      sys     0m0.004s
      
      $ time docker stop --time 0 test
      test
      
      real    0m0.197s
      user    0m0.012s
      sys     0m0.000s
      
      $ time docker rm -f test
      test
      
      real    0m0.020s
      user    0m0.008s
      sys     0m0.004s
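
The sequence requested above (stop without a grace period, then force-remove) can be sketched as a small shell helper. This is an illustrative sketch only, not the plugin's implementation: the function name stop_and_remove and the DOCKER and STOP_TIMEOUT variables are hypothetical, introduced here so the timeout can be tuned and the docker binary swapped out for a dry run.

```shell
# Hypothetical helper: stop a container with a short grace period, then remove it.
# DOCKER and STOP_TIMEOUT are assumptions of this sketch, not plugin settings.
DOCKER="${DOCKER:-docker}"
STOP_TIMEOUT="${STOP_TIMEOUT:-0}"

stop_and_remove() {
    container="$1"
    # --time is the number of seconds docker waits after the stop signal
    # before it kills the container.
    $DOCKER stop --time "$STOP_TIMEOUT" "$container"
    # rm -f mirrors the cleanup call shown in the build log.
    $DOCKER rm -f "$container"
}
```

Setting DOCKER='echo docker' makes the helper print the two commands instead of running them, which is a cheap way to check the invocation before pointing it at a real daemon.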
      

          [JENKINS-37769] Stopping the Docker container takes too long (default docker stop timeout = 10s)

          Oleksii Trekhov added a comment -

          Update: I am using the latest Docker (v1.12) on Ubuntu 16.04 x64 LTS.

          Jesse Glick added a comment -

          If stop is hitting the default 10s timeout, something deeper is wrong. I doubt that simply passing --time 0 is correcting that.


          Gabor Almer added a comment -

          Do you really need that cat to run?

          $ time docker run -t -d --name test ubuntu
          d0b48a93c51f4d56a5bcb206bb2cad24cdf16546118a320c2201ee13f068e218
          
          real    0m0.376s
          user    0m0.008s
          sys     0m0.009s
          

          And it is running

          $ docker ps
          CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS                                                          NAMES
          d0b48a93c51f        ubuntu                                     "/bin/bash"              8 seconds ago       Up 7 seconds                                                                       test
          

          And if I stop it, it stops almost right away:

          $ time docker stop test
          test
          
          real    0m0.325s
          user    0m0.007s
          sys     0m0.007s
          

          So can we skip passing cat as command when running a container?


          Bart Vanbrabant added a comment -

          I am seeing a similar problem on CentOS with Jenkins 2.21 and Docker 1.12.1. When I use a pipeline that runs a shell command in an inside block, the container is stopped after roughly 10s. In a simple script like this:

          node {
              stage('Checkout') {
                  checkout scm
              }

              stage('Unit Tests') {
                  img = docker.image("fedora:24")
                  img.inside {
                      sh "sleep 120"
                  }
              }
          }

          the Unit Tests stage runs for only about 10s instead of 120s.

          Jesse Glick added a comment -

          t_rex I can reproduce the pause. I am still trying to determine why it is pausing. From what I can tell so far, the container is killed right away (presumably after the SIGTERM), so I am wondering why it is then waiting for the timeout to expire, supposedly to send a SIGKILL. Perhaps this is a bug in Docker. Needs more investigation.

          gabealmer: "can we skip passing cat as command when running a container?"

          The problem is determining which images that would apply to.

          bartvanbrabant your problem seems to be unrelated, far more serious, and specific to the fedora image (perhaps). I think it has something to do with different procfs handling. Please file it separately.


          Jesse Glick added a comment -

          At first I thought it was this problem with systemd, but --stop-signal=RTMIN+3 does not work. Anyway the overridden --entrypoint means that systemd is not in play.


          Jesse Glick added a comment -

          Not following what is going on. If you run

          bash -c 'trap "echo term" EXIT; sleep 120'
          

          and then kill this bash process from another terminal window, you see

          term
          Terminated
          

          immediately. But if you run

          docker run --rm --name test ubuntu bash -c 'trap "echo term" EXIT; sleep 120'
          

          and from another window

          docker stop test
          

          you will see

          term
          

          and then ten seconds will elapse before the container exits. Somehow the TERM signal is getting sent, but not properly handled. (sleep is still running during this time.)


          Jesse Glick added a comment -

          Stranger still:

          docker run --rm --name test ubuntu sleep infinity
          

          Now docker exec test ps fauxwww confirms that sleep is PID 1, yet docker exec test kill -9 1 does not do anything. docker stop test will pause for ten seconds. I suspect that somehow signal delivery is just getting blocked to PID 1. On the other hand, with

          docker run --rm --name test --stop-signal=KILL ubuntu sleep infinity
          

          you will see that docker stop test works promptly. Perhaps in this case the stop command is just being clever and knows that it makes no sense to wait after sending SIGKILL, so behaves like --time=0?

          Really not sure what is going on. Passing --time 0 does correct the symptom, without addressing the underlying problem. But it is far from clear that a clean TERM is being delivered to any spawned processes even in the current state; it seems that TERM is delivered to the entrypoint (cat) but does nothing.

          I am (dimly) aware of zombie reaping issues but that does not seem to be relevant in this case.
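
One reading of the behavior above, offered as an assumption rather than a confirmed diagnosis: on Linux, the process running as PID 1 in a PID namespace is only sent signals for which it has installed a handler, so a plain cat or sleep with default dispositions never sees TERM, and a kill -9 issued from inside the container is dropped as well. The sketch below shows one way around that: a tiny wrapper that installs a TERM handler and forwards the signal to the real command. The function name run_with_term_forwarding is hypothetical.

```shell
# Hypothetical wrapper: run a command as a child and forward TERM to it.
# Because the wrapper installs a TERM handler, it can still react to TERM
# when it runs as PID 1 inside a container.
run_with_term_forwarding() {
    "$@" &
    child=$!
    # On TERM, pass the signal on to the child process.
    trap 'kill -TERM "$child" 2>/dev/null' TERM
    status=0
    # wait returns early (status > 128) if TERM arrives while waiting.
    wait "$child" || status=$?
    # If the child still exists (running or not yet reaped), collect its
    # real exit status.
    if kill -0 "$child" 2>/dev/null; then
        status=0
        wait "$child" || status=$?
    fi
    trap - TERM
    return "$status"
}
```

The wrapper returns the child's exit status, so a command terminated by the forwarded TERM is reported as 143 (128 + 15).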


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          src/main/java/org/jenkinsci/plugins/docker/workflow/client/DockerClient.java
          src/test/java/org/jenkinsci/plugins/docker/workflow/client/DockerClientTest.java
          http://jenkins-ci.org/commit/docker-workflow-plugin/0fed7a702751f743eb3603092e1731f8979930a5
          Log:
          [FIXED JENKINS-37769] Failing to stop containers cleanly, so at least stopping them quickly.

          Christian Höltje added a comment -

          This is because signals are not properly propagated without the --init flag:

          docker run --rm --name=test --init ubuntu sleep infinity

          See JENKINS-45888 for an issue about adding --init by default. Changing the --time flag is just papering over the real problem.
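
A rough way to compare stop latency with and without --init is sketched below. The helper name measure_stop_seconds, the DOCKER and IMAGE variables, and the generated container name are all hypothetical; the sketch assumes a Docker version that supports --init and an image whose sleep accepts infinity.

```shell
# Hypothetical helper: start a container, then report how many whole seconds
# `docker stop` takes. Extra arguments (for example --init) are passed
# through to `docker run`.
DOCKER="${DOCKER:-docker}"
IMAGE="${IMAGE:-ubuntu}"

measure_stop_seconds() {
    name="stop-latency-$$"
    $DOCKER run -d --name "$name" "$@" "$IMAGE" sleep infinity >/dev/null
    start=$(date +%s)
    $DOCKER stop "$name" >/dev/null
    end=$(date +%s)
    # Clean up so the name can be reused on the next call.
    $DOCKER rm -f "$name" >/dev/null
    echo $((end - start))
}
```

Calling measure_stop_seconds and then measure_stop_seconds --init gives the two numbers to compare. If the explanation in this comment holds, the first call should take about as long as the default timeout and the second should return almost immediately.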

            Assignee: jglick Jesse Glick
            Reporter: t_rex Oleksii Trekhov
            Votes: 0
            Watchers: 8