JENKINS-65125

JNLP agents sometimes fail to connect after upgrading to Jenkins 2.277.1

    Details

    • Type: Improvement
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Not A Defect
    • Component/s: kubernetes-plugin, remoting
    • Labels:
      None
    • Environment:
      Jenkins 2.277.1
      Agent Remoting 4.7
      AdoptOpenJDK 8 in both Jenkins Master container and the Agent
      Kubernetes Plugin 1.29.2
      Description

      Since I upgraded to Jenkins 2.277.1 (via Docker), the Kubernetes Plugin fails most of the time when it tries to create new agents. This leaves jobs stuck in the queue for a very long time.

      This is the trace that `jenkins-agent` prints in the container logs:

      Mar 15, 2021 9:21:36 PM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up agent: small-volume-rpzzd
      Mar 15, 2021 9:21:36 PM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Mar 15, 2021 9:21:36 PM hudson.remoting.Engine startEngine
      INFO: Using Remoting version: 4.7
      Mar 15, 2021 9:21:36 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
      INFO: Using /home/jenkins/agent/remoting as a remoting work directory
      Mar 15, 2021 9:21:36 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
      INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
      Mar 15, 2021 9:21:56 PM hudson.remoting.jnlp.Main$CuiListener error
      SEVERE: Connection failed.
      io.jenkins.remoting.shaded.javax.websocket.DeploymentException: Connection failed.
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:187)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:107)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.openClientSocket(JdkClientContainer.java:192)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$3$1.run(ClientManager.java:647)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$3.run(ClientManager.java:696)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$SameThreadExecutorService.execute(ClientManager.java:849)
              at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:493)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:337)
              at hudson.remoting.Engine.runWebSocket(Engine.java:623)
              at hudson.remoting.Engine.run(Engine.java:470)
      Caused by: java.nio.channels.UnresolvedAddressException
              at sun.nio.ch.Net.checkAddress(Net.java:104)
              at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implConnect(UnixAsynchronousSocketChannelImpl.java:302)
              at sun.nio.ch.AsynchronousSocketChannelImpl.connect(AsynchronousSocketChannelImpl.java:210)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.TransportFilter.handleConnect(TransportFilter.java:184)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:80)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:83)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:83)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.ClientFilter.connect(ClientFilter.java:99)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.connectSynchronously(JdkClientContainer.java:326)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.access$700(JdkClientContainer.java:58)
              at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:156)
              ... 12 more
      

      I already tried downgrading Agent Remoting to 4.6, but the same issue remains.

          Activity

          jthompson Jeff Thompson added a comment -

          UnresolvedAddressException almost certainly indicates some sort of networking issue, probably something in the Docker configuration. Are these standard Jenkins-produced Docker images? If so, something might have changed in how the images are generated. If you can isolate that, this could be filed against that part of the project.

          I recommend troubleshooting the networking and trying to figure out why the address doesn't resolve. I see that this is using websockets so it should be attempting to reach the standard HTTP/S port.

          I don't know of anything that has recently changed in this area that would cause an issue like this.
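
As a rough aid for that troubleshooting, the sketch below checks from inside the agent container whether the host in the Jenkins URL resolves, and which port a client would dial when the URL carries no explicit port. The class name `AgentAddressCheck` and the default URL are hypothetical, and this is not how Remoting itself performs the lookup; it only relies on the fact that a JVM channel refuses to connect to an `InetSocketAddress` whose hostname did not resolve, which is what `UnresolvedAddressException` reports.

```java
import java.net.InetSocketAddress;
import java.net.URI;

// Hypothetical helper, not part of Jenkins or Remoting.
final class AgentAddressCheck {

    /** Explicit port if the URL has one, otherwise 443 for https and 80 for anything else. */
    static int effectivePort(URI url) {
        if (url.getPort() != -1) {
            return url.getPort();
        }
        return "https".equalsIgnoreCase(url.getScheme()) ? 443 : 80;
    }

    /** True if the URL has a host and that host resolves from this JVM. */
    static boolean resolves(URI url) {
        String host = url.getHost();
        if (host == null) {
            return false;
        }
        // The (String, int) constructor attempts resolution immediately;
        // a failed lookup leaves the address marked as unresolved.
        InetSocketAddress address = new InetSocketAddress(host, effectivePort(url));
        return !address.isUnresolved();
    }

    public static void main(String[] args) {
        // Pass the same value the pod receives in JENKINS_URL; the default is a placeholder.
        URI url = URI.create(args.length > 0 ? args[0] : "https://localhost/jenkins/");
        String verdict = resolves(url) ? "resolves" : "does NOT resolve";
        System.out.println(url.getHost() + ":" + effectivePort(url) + " " + verdict);
    }
}
```

Running it a few times in a pod on the affected cluster, with the real Jenkins URL as the argument, should show whether the failure is in name resolution rather than in the WebSocket handshake.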

          felipecassiors Felipe Santos added a comment -

          I'm using my own image, https://github.com/felipecrs/jenkins-agent-dind. I'll try switching to the official jnlp image, although I'll have to extend it to add my certificates because of https://github.com/jenkinsci/kubernetes-plugin#using-websockets-with-a-jenkins-master-with-self-signed-https-certificate.

          felipecassiors Felipe Santos added a comment - - edited

          The exact same error happens with the official image, extended with this Dockerfile:

          FROM jenkins/inbound-agent
          
          SHELL [ "/bin/bash", "-euxo", "pipefail", "-c" ]
          
          USER root
          
          # add ericsson certificates
          RUN curl -fsSLo /tmp/cert.pem https://<hidden>.crt; \
              keytool -noprompt -storepass changeit \
                  -keystore "$JAVA_HOME/jre/lib/security/cacerts" \
                  -import -file /tmp/cert.pem -alias jenkinsMaster; \
              rm -f /tmp/cert.pem
          
          USER jenkins
          

          Error:

          ❯ kubectl logs test-8dshm
          Mar 15, 2021 11:03:10 PM hudson.remoting.jnlp.Main createEngine
          INFO: Setting up agent: test-8dshm
          Mar 15, 2021 11:03:10 PM hudson.remoting.jnlp.Main$CuiListener <init>
          INFO: Jenkins agent is running in headless mode.
          Mar 15, 2021 11:03:10 PM hudson.remoting.Engine startEngine
          INFO: Using Remoting version: 4.6
          Mar 15, 2021 11:03:10 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
          INFO: Using /home/jenkins/agent/remoting as a remoting work directory
          Mar 15, 2021 11:03:10 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
          INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
          Mar 15, 2021 11:03:30 PM hudson.remoting.jnlp.Main$CuiListener error
          SEVERE: Connection failed.
          io.jenkins.remoting.shaded.javax.websocket.DeploymentException: Connection failed.
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:187)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:107)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.openClientSocket(JdkClientContainer.java:192)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$3$1.run(ClientManager.java:647)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$3.run(ClientManager.java:696)
                  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$SameThreadExecutorService.execute(ClientManager.java:849)
                  at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:493)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:337)
                  at hudson.remoting.Engine.runWebSocket(Engine.java:623)
                  at hudson.remoting.Engine.run(Engine.java:470)
          Caused by: java.nio.channels.UnresolvedAddressException
                  at sun.nio.ch.Net.checkAddress(Net.java:104)
                  at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implConnect(UnixAsynchronousSocketChannelImpl.java:302)
                  at sun.nio.ch.AsynchronousSocketChannelImpl.connect(AsynchronousSocketChannelImpl.java:210)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.TransportFilter.handleConnect(TransportFilter.java:184)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:80)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:83)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:83)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.ClientFilter.connect(ClientFilter.java:99)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.connectSynchronously(JdkClientContainer.java:326)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.access$700(JdkClientContainer.java:58)
                  at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:156)
                  ... 12 more
          

          This is my Kubernetes pod template:

           apiVersion: "v1"
           kind: "Pod"
           metadata: 
             labels: 
               jenkins: "slave"
               jenkins/label-digest: "87f8ed9157125ffc4da9e06a7b8011ad80a53fe1"
               jenkins/label: "test"
             name: "test-xc38h"
           spec: 
             containers: 
             - env: 
               - name: "JENKINS_SECRET"
                 value: "********"
               - name: "JENKINS_AGENT_NAME"
                 value: "test-xc38h"
               - name: "JENKINS_WEB_SOCKET"
                 value: "true"
               - name: "JENKINS_NAME"
                 value: "test-xc38h"
               - name: "JENKINS_AGENT_WORKDIR"
                 value: "/home/jenkins/agent"
               - name: "JENKINS_URL"
                 value: "https://<hidden>/jenkins/"
               image: "<hidden>/jenkins-inbound-agent"
               imagePullPolicy: "Always"
               name: "jnlp"
               resources: 
                 limits: {}
                 requests: {}
               tty: false
               volumeMounts: 
               - mountPath: "/home/jenkins/agent"
                 name: "workspace-volume"
                 readOnly: false
               workingDir: "/home/jenkins/agent"
             hostNetwork: false
             nodeSelector: 
               kubernetes.io/os: "linux"
             restartPolicy: "Never"
             volumes: 
             - emptyDir: 
                 medium: ""
               name: "workspace-volume"
          

          And yes, I'm using WebSockets as you said.

          felipecassiors Felipe Santos added a comment -

          OK, thank you so much for your help. This is indeed a network issue: my cluster seems to be experiencing DNS problems. I've been trying to fix it since last Thursday, and I opened this issue because I thought I had exhausted all my options.
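
For intermittent DNS trouble of this kind, a small probe that repeats the lookup and counts failures can help confirm the diagnosis. The sketch below is only an illustration: the class name `DnsFlakinessProbe`, the attempt count, and the pause are arbitrary choices, and disabling the JVM lookup cache through the `networkaddress.cache.ttl` security properties is assumed to happen before the first lookup in the process.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

// Hypothetical helper, not part of Jenkins or the Kubernetes plugin.
final class DnsFlakinessProbe {

    /** Looks up the host the given number of times and returns how many lookups failed. */
    static int countFailures(String host, int attempts, long pauseMillis) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                InetAddress.getByName(host);
            } catch (UnknownHostException e) {
                failures++;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop probing early.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Turn off positive and negative DNS caching so each attempt asks the resolver again.
        Security.setProperty("networkaddress.cache.ttl", "0");
        Security.setProperty("networkaddress.cache.negative.ttl", "0");

        String host = args.length > 0 ? args[0] : "localhost";
        int attempts = 20;
        int failures = countFailures(host, attempts, 500L);
        System.out.println(failures + " of " + attempts + " lookups failed for " + host);
    }
}
```

A non-zero failure count for the Jenkins hostname, run from a pod on the affected nodes, would point at the cluster DNS rather than at Jenkins or Remoting.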

          felipecassiors Felipe Santos added a comment -

          This is a network issue in our cluster.

          jthompson Jeff Thompson added a comment -

          I'm glad you were able to figure it out. Thanks for reporting back and closing this ticket.


            People

            Assignee:
            jthompson Jeff Thompson
            Reporter:
            felipecassiors Felipe Santos
            Votes:
            0
            Watchers:
            2
