Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-42187

Docker plugin causes queue hanging in the case of hanging requests

      Stacktrace explanation: When DockerOnceRetentionStrategy runs, it locks the Jenkins Queue. If it decides to terminate a Cloud agent, the a DockerSlave#_terminate() gets invoked. This method invokes the REST API call using docker-java. This call has no timeout: https://github.com/jenkinsci/docker-plugin/blob/master/docker-plugin/src/main/java/com/nirima/jenkins/plugins/docker/DockerSlave.java#L168 .

      If the REST API hangs due to whatever reason, the entire Queue hangs till the request gets interrupted somehow. We see it on one of the instances, where containers cannot be terminated sometimes.

      Queue hanging causes massive outage of the Jenkins functionality, including build scheduling and particular UI widgets.

      IMHO all calls to Docker Java REST API in the plugin should have the timeout specified. E.g. Yet Another Docker Plugin does it: https://github.com/KostyaSha/yet-another-docker-plugin/blob/6853301c885447ca31648d6cfa4861e6e272bf16/yet-another-docker-plugin/src/main/java/com/github/kostyasha/yad/commons/DockerStopContainer.java#L37-L41

      java.lang.Thread.State: RUNNABLE
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
      at java.net.SocketInputStream.read(SocketInputStream.java:170)
      at java.net.SocketInputStream.read(SocketInputStream.java:141)
      at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
      at sun.security.ssl.InputRecord.read(InputRecord.java:503)
      at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
      - locked <0x0000000704c04ba8> (a java.lang.Object)
      at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
      at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
      - locked <0x0000000704c06bc8> (a sun.security.ssl.AppInputStream)
      at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
      at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
      at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
      at org.apache.http.impl.io.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:129)
      at org.apache.http.impl.io.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:53)
      at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
      at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
      at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
      at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
      at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
      at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
      at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
      at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
      at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
      at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
      at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
      at org.glassfish.jersey.apache.connector.ApacheConnector.apply(ApacheConnector.java:435)
      at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:252)
      at org.glassfish.jersey.client.JerseyInvocation$1.call(JerseyInvocation.java:684)
      at org.glassfish.jersey.client.JerseyInvocation$1.call(JerseyInvocation.java:681)
      at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
      at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
      at org.glassfish.jersey.internal.Errors.process(Errors.java:228)
      at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:444)
      at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:681)
      at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:437)
      at org.glassfish.jersey.client.JerseyInvocation$Builder.post(JerseyInvocation.java:343)
      at com.github.dockerjava.jaxrs.StopContainerCmdExec.execute(StopContainerCmdExec.java:31)
      at com.github.dockerjava.jaxrs.StopContainerCmdExec.execute(StopContainerCmdExec.java:12)
      at com.github.dockerjava.jaxrs.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:23)
      at com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:35)
      at com.github.dockerjava.core.command.StopContainerCmdImpl.exec(StopContainerCmdImpl.java:63)
      at com.nirima.jenkins.plugins.docker.DockerSlave._terminate(DockerSlave.java:168)
      at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:67)
      at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy$1$1.run(DockerOnceRetentionStrategy.java:112)
      at hudson.model.Queue._withLock(Queue.java:1306)
      at hudson.model.Queue.withLock(Queue.java:1189)
      at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy$1.run(DockerOnceRetentionStrategy.java:106)
      

          [JENKINS-42187] Docker plugin causes queue hanging in the case of hanging requests

          Oleg Nenashev created issue -
          Nicolas De Loof made changes -
          Assignee Original: magnayn [ magnayn ] New: Nicolas De Loof [ ndeloof ]
          Nicolas De Loof made changes -
          Link New: This issue is duplicated by JENKINS-38369 [ JENKINS-38369 ]
          Nicolas De Loof made changes -
          Nicolas De Loof made changes -
          Remote Link New: This issue links to "PR (Web Link)" [ 17672 ]
          Nicolas De Loof made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Closed [ 6 ]
          James Dumay made changes -
          Remote Link New: This issue links to "CloudBees Internal OSS-2010 (Web Link)" [ 18460 ]

            ndeloof Nicolas De Loof
            oleg_nenashev Oleg Nenashev
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: