
Hang when using ssh after upgrade to ssh-agent 1.21

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Blocker
    • Component: ssh-agent-plugin
    • Labels: None
    • Environment: Jenkins version 2.280
      ssh-agent 1.21
      ssh-credentials 1.18.1

      After upgrading to ssh-agent 1.21 we had a job hang at a git command.

      We terminated it after more than 30 minutes. We tried again several times, with the same result.

      After reverting to 1.20 it worked again.

      Versions:

      • Jenkins 2.280
      • ssh-agent 1.21
      • ssh-credentials 1.18.1

      Log:

      ...
      [Pipeline] sshagent
      15:27:55  [ssh-agent] Using credentials jenkins (jenkins ssh key)
      15:27:55  [ssh-agent] Looking for ssh-agent implementation...
      15:27:55  [ssh-agent]   Exec ssh-agent (binary ssh-agent on a remote machine)
      15:27:55  $ docker exec bb2595d46378e3aa4279cba0a20d8e3026f79d84bc9d2f2b5b856bbfba3ab1f5 ssh-agent
      15:27:55  [ssh-agent]   Java/JNR ssh-agent
      15:27:55  [ssh-agent] Registered BouncyCastle on the remote agent
      15:27:55  [ssh-agent] Started.
      [Pipeline] {
      [Pipeline] sh
      15:27:55  + git ls-remote --heads origin
      

       

       Probably related to JENKINS-50181.

          [JENKINS-64910] Hang when using ssh after upgrade to ssh-agent 1.21

          Jesse Glick added a comment -

          From your log, you are using a deprecated in-Java implementation. Make sure the ssh-agent command is installed on your build node. See https://github.com/jenkinsci/ssh-agent-plugin/pull/48


          Robin Karlsson added a comment -

          The build is using a docker agent with the image python:3.8

          agent {
              docker {
                  image 'python:3.8'
                  args '-u root:root'
              }
          }
          

          As far as I can tell, that image has ssh-agent installed:

          $ docker run --rm -it python:3.8 ssh-agent
          SSH_AUTH_SOCK=/tmp/ssh-8QKJYPNyxAfB/agent.1; export SSH_AUTH_SOCK;
          SSH_AGENT_PID=8; export SSH_AGENT_PID;
          echo Agent pid 8;
          

          All hosts that run the Jenkins agents also have ssh-agent installed, if that matters.

          Is there any way to get more info about why the Java implementation was used and not the exec-based ssh-agent?
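For reference, the lines that `ssh-agent` prints in this comment are Bourne-shell assignments that the caller is expected to evaluate or parse. The sketch below pulls a single variable out of such output. `agent_env_value` is a hypothetical helper name, and this is only an illustration, not how the plugin itself parses the output.

```shell
#!/bin/sh
# Hypothetical helper: extract VAR's value from ssh-agent style output on stdin.
# Expects lines such as: SSH_AUTH_SOCK=/tmp/ssh-XXXX/agent.1; export SSH_AUTH_SOCK;
agent_env_value() {
    # -n suppresses default printing; the p flag prints only lines that matched.
    sed -n "s/^$1=\([^;]*\);.*/\1/p"
}

# Sample output copied from the comment this follows.
sample='SSH_AUTH_SOCK=/tmp/ssh-8QKJYPNyxAfB/agent.1; export SSH_AUTH_SOCK;
SSH_AGENT_PID=8; export SSH_AGENT_PID;
echo Agent pid 8;'

printf '%s\n' "$sample" | agent_env_value SSH_AUTH_SOCK
printf '%s\n' "$sample" | agent_env_value SSH_AGENT_PID
```

In an interactive shell the usual shortcut is `eval "$(ssh-agent -s)"`, which sets both variables in the current session.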


          Mathieu added a comment -

          I confirm the ssh-agent is also installed in the docker image used to build on my side.

          This agent also works correctly: if I copy the SSH key in the docker instance, launch the agent and then try to do a git clone or connect to a SSH remote, everything works fine.


          Jesse Glick added a comment -

          mbriand snago see JENKINS-43050; you cannot mix this plugin with docker-workflow. Frankly I recommend using neither.


          Vittorio added a comment -

          jglick Thank you very much for your comment. I have solved the issue by installing the `ssh-agent` command in the container running the pipeline's stage, which uses the SSH Agent plugin.

          So my pipeline is now working with version 1.21.
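In cases like this one, a pre-flight check could make a missing binary fail fast instead of leaving the build hanging. The following is only a sketch: `require_cmd` is a hypothetical helper, not part of the plugin, and how the binary gets installed depends on the image's package manager.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: succeed only if the named command is on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "required command not found: $1" >&2
    return 1
}

# Example: check inside the container before entering an sshagent step.
if require_cmd ssh-agent; then
    echo "ssh-agent is available"
else
    echo "ssh-agent is missing; install it in the image first"
fi
```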


          Jesse Glick added a comment -

          Good to know. So at least some cases of this symptom may be “resolved” by https://github.com/jenkinsci/ssh-agent-plugin/pull/48 to the extent that the build would immediately fail for a clearer reason: a missing executable.


          Robin Karlsson added a comment -

          We managed to refactor our pipeline to do the sshagent steps in a separate stage on a non-docker node. That works fine both before and after upgrading to 1.21.

          So as far as I'm concerned this issue could be closed (maybe as "Won't Fix" if you use that).


          Mathieu added a comment -

          OK, I finally found the issue on my side, and it is not directly related to the ssh-agent binary.

          The plugin runs a `docker exec` command to start the agent, but that command was failing because it could not connect to the Docker service. (The service is not accessible on /var/run/docker.sock in my setup, so it was just a matter of setting a correct value for DOCKER_HOST.)
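To check for this kind of misconfiguration from a shell on the build node, something like the following sketch may help. `docker_endpoint` is a hypothetical helper name, and the fallback `unix:///var/run/docker.sock` assumes a default Linux setup with no Docker context configured.

```shell
#!/bin/sh
# Hypothetical helper: print the endpoint the docker CLI would normally use.
# Assumption: default Linux setup with no Docker context configured.
docker_endpoint() {
    printf '%s\n' "${DOCKER_HOST:-unix:///var/run/docker.sock}"
}

echo "docker endpoint: $(docker_endpoint)"

# 'docker version' contacts the daemon, so it fails when the endpoint is wrong.
if command -v docker >/dev/null 2>&1; then
    if docker version >/dev/null 2>&1; then
        echo "docker daemon reachable"
    else
        echo "docker daemon NOT reachable; check DOCKER_HOST" >&2
    fi
else
    echo "docker CLI not installed; skipping connectivity check" >&2
fi
```

Running this as the same user and with the same environment as the Jenkins agent process matters, since `DOCKER_HOST` is read from that environment.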

          Maybe it would be useful to print the failure output of the ssh-agent command in the Jenkins build log? I'm not really familiar with Java, but here is what I used for debugging:

          diff --git i/src/main/java/com/cloudbees/jenkins/plugins/sshagent/exec/ExecRemoteAgent.java w/src/main/java/com/cloudbees/jenkins/plugins/sshagent/exec/ExecRemoteAgent.java
          index 93b59af1e316..8cf8e5a7a797 100644
          --- i/src/main/java/com/cloudbees/jenkins/plugins/sshagent/exec/ExecRemoteAgent.java
          +++ w/src/main/java/com/cloudbees/jenkins/plugins/sshagent/exec/ExecRemoteAgent.java
          @@ -63,7 +63,8 @@ public class ExecRemoteAgent implements RemoteAgent {
                   ByteArrayOutputStream baos = new ByteArrayOutputStream();
                   if (launcherProvider.getLauncher().launch().cmds("ssh-agent").stdout(baos).start()
                           .joinWithTimeout(1, TimeUnit.MINUTES, listener) != 0) {
          -            throw new AbortException("Failed to run ssh-agent");
          +            String reason = new String(baos.toByteArray(), StandardCharsets.US_ASCII);
          +            throw new AbortException("Failed to run ssh-agent: " + reason);
                   }
                   agentEnv = parseAgentEnv(new String(baos.toByteArray(), StandardCharsets.US_ASCII), listener); // TODO could include local filenames, better to look up remote charset
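
The idea in the diff, attaching the command's captured output to the failure message, can also be sketched outside the plugin. The shell version below is an illustration only: `run_or_explain` is a hypothetical helper, and unlike the diff (which captures stdout) it captures stderr as well.

```shell
#!/bin/sh
# Hypothetical helper: run a command; on failure, report its captured output.
# Note: POSIX sh has no 'local', so 'status' and 'output' are global variables.
run_or_explain() {
    status=0
    # Capture stdout and stderr together so the failure reason is not lost.
    output=$("$@" 2>&1) || status=$?
    if [ "$status" -ne 0 ]; then
        echo "Failed to run $1 (exit $status): $output" >&2
        return "$status"
    fi
    printf '%s\n' "$output"
}

# Example: a command that prints a reason and then fails.
run_or_explain sh -c 'echo "cannot connect to daemon"; exit 3' \
    || echo "command failed with status $?"
```

With a wrapper like this, a failing `docker exec ... ssh-agent` would surface the connection error in the log rather than only a generic failure message.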
          
          


          Jesse Glick added a comment -

          mbriand sounds right as a diagnostic; would you mind opening a pull request to that effect?


          Mathieu added a comment -

          jglick OK, I just filed https://github.com/jenkinsci/ssh-agent-plugin/pull/50.

            Assignee: Unassigned
            Reporter: Robin Karlsson (snago)
            Votes: 2
            Watchers: 4