JENKINS-28890

Docker terminates images twice and fails with SEVERE exception on the second attempt

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: docker-plugin
    • Environment: docker-0.9.2, jenkins-1.609.1

      This seems to be a regression, because everything works well in docker-plugin-0.8 (which does not have DockerOnceRetentionStrategy). It appears on the latest Jenkins LTS with an installation from scratch.

      1) The build completes successfully
      2) A termination request comes from the default DockerOnceRetentionStrategy
      3) The Docker container terminates with a WARNING message (should be INFO, I think)
      4) After some time, Jenkins runs its own CloudRetentionStrategy check
      5) The plugin tries to terminate the container again and gets a SEVERE exception because the container no longer exists

      Jun 12, 2015 6:54:03 PM com.nirima.jenkins.plugins.docker.DockerCloud provision
      INFO: Asked to provision 1 slave(s) for: docker
      Jun 12, 2015 6:54:03 PM com.nirima.jenkins.plugins.docker.DockerCloud provision
      INFO: Will provision "evarga/jenkins-slave" for: docker
      Jun 12, 2015 6:54:03 PM com.nirima.jenkins.plugins.docker.DockerCloud addProvisionedSlave
      INFO: Provisioning "evarga/jenkins-slave" number 0 on "shared-docker-cloud"; Total containers: 0
      Launching evarga/jenkins-slave
      Jun 12, 2015 6:54:03 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
      INFO: Started provisioning Image of evarga/jenkins-slave from shared-docker-cloud with 1 executors. Remaining excess workload: 0
      Jun 12, 2015 6:54:03 PM com.nirima.jenkins.plugins.docker.DockerComputerLauncher getSSHLauncher
      INFO: Creating slave SSH launcher for 192.168.59.103:32771
      [06/12/15 18:54:11] SSH Launch of acd142e301a0@shared-docker-cloud on 192.168.59.103 completed in 7,253 ms
      Jun 12, 2015 6:54:11 PM hudson.model.Run execute
      INFO: test-docker #2 main build action completed: SUCCESS
      Jun 12, 2015 6:54:11 PM hudson.model.Executor finish2
      WARNING: Executor #0 for acd142e301a0@shared-docker-cloud : executing test-docker #2 termination trace
      hudson.model.Computer$TerminationRequest: Termination requested at Fri Jun 12 18:54:11 MSK 2015 by Thread[Executor #0 for acd142e301a0@shared-docker-cloud : executing test-docker #2,5,main] [id=289]
      	at hudson.model.Computer.recordTermination(Computer.java:214)
      	at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:63)
      	at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.done(DockerOnceRetentionStrategy.java:77)
      	at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.taskCompleted(DockerOnceRetentionStrategy.java:59)
      	at hudson.slaves.SlaveComputer.taskCompleted(SlaveComputer.java:301)
      	at com.nirima.jenkins.plugins.docker.DockerComputer.taskCompleted(DockerComputer.java:63)
      	at hudson.model.queue.WorkUnitContext.synchronizeEnd(WorkUnitContext.java:145)
      	at hudson.model.Executor.finish1(Executor.java:424)
      	at hudson.model.Executor.run(Executor.java:394)
      
      Jun 12, 2015 6:54:11 PM hudson.model.Executor finish2
      WARNING: Executor #0 for acd142e301a0@shared-docker-cloud : executing test-docker #2 termination trace
      hudson.model.Computer$TerminationRequest: Termination requested at Fri Jun 12 18:54:11 MSK 2015 by Thread[Executor #0 for acd142e301a0@shared-docker-cloud : executing test-docker #2,5,main] [id=289]
      	at hudson.model.Computer.recordTermination(Computer.java:214)
      	at hudson.model.Computer.disconnect(Computer.java:465)
      	at hudson.slaves.SlaveComputer.disconnect(SlaveComputer.java:601)
      	at com.nirima.jenkins.plugins.docker.DockerSlave._terminate(DockerSlave.java:101)
      	at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:67)
      	at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.done(DockerOnceRetentionStrategy.java:77)
      	at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.taskCompleted(DockerOnceRetentionStrategy.java:59)
      	at hudson.slaves.SlaveComputer.taskCompleted(SlaveComputer.java:301)
      	at com.nirima.jenkins.plugins.docker.DockerComputer.taskCompleted(DockerComputer.java:63)
      	at hudson.model.queue.WorkUnitContext.synchronizeEnd(WorkUnitContext.java:145)
      	at hudson.model.Executor.finish1(Executor.java:424)
      	at hudson.model.Executor.run(Executor.java:394)
      
      Jun 12, 2015 6:54:11 PM hudson.model.Executor finish2
      WARNING: Executor #0 for acd142e301a0@shared-docker-cloud : executing test-docker #2 termination trace
      hudson.model.Computer$TerminationRequest: Termination requested at Fri Jun 12 18:54:11 MSK 2015 by Thread[Executor #0 for acd142e301a0@shared-docker-cloud : executing test-docker #2,5,main] [id=289]
      	at hudson.model.Computer.recordTermination(Computer.java:214)
      	at jenkins.model.Nodes$3.run(Nodes.java:165)
      	at hudson.model.Queue._withLock(Queue.java:1207)
      	at hudson.model.Queue.withLock(Queue.java:1143)
      	at jenkins.model.Nodes.removeNode(Nodes.java:160)
      	at jenkins.model.Jenkins.removeNode(Jenkins.java:1700)
      	at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
      	at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.done(DockerOnceRetentionStrategy.java:77)
      	at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.taskCompleted(DockerOnceRetentionStrategy.java:59)
      	at hudson.slaves.SlaveComputer.taskCompleted(SlaveComputer.java:301)
      	at com.nirima.jenkins.plugins.docker.DockerComputer.taskCompleted(DockerComputer.java:63)
      	at hudson.model.queue.WorkUnitContext.synchronizeEnd(WorkUnitContext.java:145)
      	at hudson.model.Executor.finish1(Executor.java:424)
      	at hudson.model.Executor.run(Executor.java:394)
      
      Jun 12, 2015 6:54:11 PM hudson.model.Executor finish2
      WARNING: Executor #0 for acd142e301a0@shared-docker-cloud : executing test-docker #2 termination trace
      hudson.model.Computer$TerminationRequest: Termination requested at Fri Jun 12 18:54:11 MSK 2015 by Thread[Executor #0 for acd142e301a0@shared-docker-cloud : executing test-docker #2,5,main] [id=289]
      	at hudson.model.Computer.recordTermination(Computer.java:214)
      	at hudson.model.Computer.disconnect(Computer.java:465)
      	at hudson.slaves.SlaveComputer.disconnect(SlaveComputer.java:601)
      	at jenkins.model.Nodes$3.run(Nodes.java:166)
      	at hudson.model.Queue._withLock(Queue.java:1207)
      	at hudson.model.Queue.withLock(Queue.java:1143)
      	at jenkins.model.Nodes.removeNode(Nodes.java:160)
      	at jenkins.model.Jenkins.removeNode(Jenkins.java:1700)
      	at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
      	at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.done(DockerOnceRetentionStrategy.java:77)
      	at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.taskCompleted(DockerOnceRetentionStrategy.java:59)
      	at hudson.slaves.SlaveComputer.taskCompleted(SlaveComputer.java:301)
      	at com.nirima.jenkins.plugins.docker.DockerComputer.taskCompleted(DockerComputer.java:63)
      	at hudson.model.queue.WorkUnitContext.synchronizeEnd(WorkUnitContext.java:145)
      	at hudson.model.Executor.finish1(Executor.java:424)
      	at hudson.model.Executor.run(Executor.java:394)
      
      Jun 12, 2015 6:54:13 PM com.nirima.jenkins.plugins.docker.utils.RetryingComputerLauncher launch
      INFO: Launch failed, pausing before retry.
      Jun 12, 2015 6:54:13 PM hudson.slaves.NodeProvisioner$2 run
      INFO: Image of evarga/jenkins-slave provisioning successfully completed. We have now 2 computer(s)
      [06/12/15 18:54:18] SSH Launch of acd142e301a0@shared-docker-cloud on 192.168.59.103 failed in 5 ms
      Jun 12, 2015 6:54:37 PM hudson.slaves.CloudRetentionStrategy check
      INFO: Disconnecting acd142e301a0@shared-docker-cloud
      Jun 12, 2015 6:54:37 PM com.nirima.jenkins.plugins.docker.DockerSlave _terminate
      SEVERE: Failed to stop instance acd142e301a09f5ce72bb1af3b76ec9753337af81bcafcaa1b13d264cd43a869 for slave acd142e301a0@shared-docker-cloud due to exception
      com.github.dockerjava.api.NotFoundException: no such id: acd142e301a09f5ce72bb1af3b76ec9753337af81bcafcaa1b13d264cd43a869
      
      	at com.github.dockerjava.core.util.ResponseStatusExceptionFilter.filter(ResponseStatusExceptionFilter.java:48)
      	at org.glassfish.jersey.client.ClientFilteringStages$ResponseFilterStage.apply(ClientFilteringStages.java:134)
      	at org.glassfish.jersey.client.ClientFilteringStages$ResponseFilterStage.apply(ClientFilteringStages.java:123)
      	at org.glassfish.jersey.process.internal.Stages.process(Stages.java:171)
      	at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:251)
      	at org.glassfish.jersey.client.JerseyInvocation$1.call(JerseyInvocation.java:667)
      	at org.glassfish.jersey.client.JerseyInvocation$1.call(JerseyInvocation.java:664)
      	at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
      	at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
      	at org.glassfish.jersey.internal.Errors.process(Errors.java:228)
      	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:424)
      	at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:664)
      	at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:424)
      	at org.glassfish.jersey.client.JerseyInvocation$Builder.post(JerseyInvocation.java:333)
      	at com.github.dockerjava.jaxrs.StopContainerCmdExec.execute(StopContainerCmdExec.java:29)
      	at com.github.dockerjava.jaxrs.StopContainerCmdExec.execute(StopContainerCmdExec.java:11)
      	at com.github.dockerjava.jaxrs.AbstrDockerCmdExec.exec(AbstrDockerCmdExec.java:57)
      	at com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:29)
      	at com.github.dockerjava.core.command.StopContainerCmdImpl.exec(StopContainerCmdImpl.java:66)
      	at com.nirima.jenkins.plugins.docker.DockerSlave._terminate(DockerSlave.java:105)
      	at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:67)
      	at hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:62)
      	at hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:46)
      	at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:70)
      	at hudson.model.Queue._withLock(Queue.java:1207)
      	at hudson.model.Queue.withLock(Queue.java:1143)
      	at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:61)
      	at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:51)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      
      Jun 12, 2015 6:54:37 PM com.nirima.jenkins.plugins.docker.DockerSlave _terminate
      SEVERE: Failed to remove instance acd142e301a09f5ce72bb1af3b76ec9753337af81bcafcaa1b13d264cd43a869 for slave acd142e301a0@shared-docker-cloud due to exception
      com.github.dockerjava.api.NotFoundException: no such id: acd142e301a09f5ce72bb1af3b76ec9753337af81bcafcaa1b13d264cd43a869
      
      	at com.github.dockerjava.core.util.ResponseStatusExceptionFilter.filter(ResponseStatusExceptionFilter.java:48)
      	at org.glassfish.jersey.client.ClientFilteringStages$ResponseFilterStage.apply(ClientFilteringStages.java:134)
      	at org.glassfish.jersey.client.ClientFilteringStages$ResponseFilterStage.apply(ClientFilteringStages.java:123)
      	at org.glassfish.jersey.process.internal.Stages.process(Stages.java:171)
      	at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:251)
      	at org.glassfish.jersey.client.JerseyInvocation$1.call(JerseyInvocation.java:667)
      	at org.glassfish.jersey.client.JerseyInvocation$1.call(JerseyInvocation.java:664)
      	at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
      	at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
      	at org.glassfish.jersey.internal.Errors.process(Errors.java:228)
      	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:424)
      	at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:664)
      	at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:399)
      	at org.glassfish.jersey.client.JerseyInvocation$Builder.delete(JerseyInvocation.java:348)
      	at com.github.dockerjava.jaxrs.RemoveContainerCmdExec.execute(RemoveContainerCmdExec.java:26)
      	at com.github.dockerjava.jaxrs.RemoveContainerCmdExec.execute(RemoveContainerCmdExec.java:11)
      	at com.github.dockerjava.jaxrs.AbstrDockerCmdExec.exec(AbstrDockerCmdExec.java:57)
      	at com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:29)
      	at com.github.dockerjava.core.command.RemoveContainerCmdImpl.exec(RemoveContainerCmdImpl.java:77)
      	at com.nirima.jenkins.plugins.docker.DockerSlave._terminate(DockerSlave.java:121)
      	at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:67)
      	at hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:62)
      	at hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:46)
      	at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:70)
      	at hudson.model.Queue._withLock(Queue.java:1207)
      	at hudson.model.Queue.withLock(Queue.java:1143)
      	at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:61)
      	at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:51)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
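
The sequence in the report (one termination from DockerOnceRetentionStrategy, then a second one from CloudRetentionStrategy that hits a missing container) is a non-idempotent cleanup. The following is only a minimal sketch of one way to make such a cleanup safe to call twice; it is not the plugin's actual fix. All names here (`ContainerClient`, `ContainerNotFoundException`, `FakeClient`, `OnceTerminatingSlave`) are hypothetical stand-ins and not the docker-java or docker-plugin API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

public class IdempotentTerminationSketch {

    /** Hypothetical stand-in for the "no such id" error seen in the log. */
    static class ContainerNotFoundException extends Exception {
        ContainerNotFoundException(String id) { super("no such id: " + id); }
    }

    /** Hypothetical minimal client interface; not the docker-java API. */
    interface ContainerClient {
        void stop(String id) throws ContainerNotFoundException;
        void remove(String id) throws ContainerNotFoundException;
    }

    /** In-memory fake that fails on unknown containers, like the real daemon. */
    static class FakeClient implements ContainerClient {
        private final Set<String> containers = new HashSet<>();
        int stopCalls = 0;
        int removeCalls = 0;

        FakeClient(String... ids) { containers.addAll(Arrays.asList(ids)); }

        public void stop(String id) throws ContainerNotFoundException {
            stopCalls++;
            if (!containers.contains(id)) throw new ContainerNotFoundException(id);
        }

        public void remove(String id) throws ContainerNotFoundException {
            removeCalls++;
            if (!containers.remove(id)) throw new ContainerNotFoundException(id);
        }
    }

    /** Hypothetical slave wrapper whose terminate() may be called repeatedly. */
    static class OnceTerminatingSlave {
        private final String containerId;
        private final ContainerClient client;
        private final AtomicBoolean terminated = new AtomicBoolean(false);
        final List<String> log = new ArrayList<>();

        OnceTerminatingSlave(String containerId, ContainerClient client) {
            this.containerId = containerId;
            this.client = client;
        }

        void terminate() {
            // Only the first caller gets past this guard; later callers return early.
            if (!terminated.compareAndSet(false, true)) {
                log.add("FINE: termination already requested for " + containerId);
                return;
            }
            try {
                client.stop(containerId);
                client.remove(containerId);
                log.add("INFO: stopped and removed " + containerId);
            } catch (ContainerNotFoundException e) {
                // A missing container is already the desired end state, so do not log SEVERE.
                log.add("INFO: container already gone: " + containerId);
            }
        }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient("acd142e301a0");
        OnceTerminatingSlave slave = new OnceTerminatingSlave("acd142e301a0", client);
        slave.terminate(); // first request, e.g. from the once-retention strategy
        slave.terminate(); // second request, e.g. from the periodic retention check
        slave.log.forEach(System.out::println);
    }
}
```

The guard covers repeated calls on the same object, while the catch block covers the case where the container was removed by someone else before the first call. Either measure alone would avoid the SEVERE entries in the log above, under the assumptions stated.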
      
      


          Oleg Nenashev added a comment -

          > Please stop mentioning 0.8 because it had different docker libraries and everything else.

          A regression is a regression. Even if you don't need this version, other users may find this info useful when they make decisions about upgrades.


          Kanstantsin Shautsou added a comment -

          Also note that the current provisioning mechanism is known to be bad. I hope this will change soon. You can follow and try the new implementation, and provide your feedback, at https://github.com/jenkinsci/docker-plugin/pull/234#issuecomment-110921198

          T.B. Anton added a comment -

          I am seeing a very similar stack trace, albeit from a Windows build executor (beginning of the stack trace):

          WARNING: Executor #1 for teams-build-win : executing TEST-build-windows-libraries #20 termination trace
          hudson.model.Computer$TerminationRequest: Termination requested at Tue Aug 04 12:18:23 CEST 2015 by Thread[Handling POST /computer/teams-build-win/doDisconnect from 10.96.1.213 : RequestHandlerThread[#5],5,main] [id=26] from HTTP request for http://jenkins.tba.intern/computer/teams-build-win/doDisconnect
                  at hudson.model.Computer.recordTermination(Computer.java:205)
                  at hudson.model.Computer.disconnect(Computer.java:465)
                  at hudson.slaves.SlaveComputer.disconnect(SlaveComputer.java:601)
                  at hudson.slaves.SlaveComputer.doDoDisconnect(SlaveComputer.java:594)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:606)
          

          Could this be related? Or should I open another ticket?


          Kanstantsin Shautsou added a comment -

          First, verify that you are running the latest versions.

          T.B. Anton added a comment -

          We are also running Jenkins 1.609.1.


          Kanstantsin Shautsou added a comment -

          I mean the Docker plugin.

          T.B. Anton added a comment -

          We are not using Docker, yet the stack traces look very similar. That's why I was wondering whether what I'm seeing on a Windows slave could share the same root cause.


          T.B. Anton added a comment -

          I think I found the cause of the doDisconnect. The script that was executed on the Windows slave called 'exit' on its last line. Since I removed that line, everything seems to be working again. Sorry about the noise.


          Kanstantsin Shautsou added a comment -

          oleg_nenashev, I think this issue should be resolved in the latest plugin versions. Please try 0.11.0, for example, and reopen if the issue still exists.

          Oleg Nenashev added a comment -

          I'll try to check it next week.


            Assignee: Kanstantsin Shautsou (integer)
            Reporter: Oleg Nenashev (oleg_nenashev)