Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-59259

sshAgent step does not survive agent disconnect/reconnect

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: Minor Minor
    • ssh-agent-plugin
    • None
    • Jenkins 2.176.2
      ssh-agent:1.17
      durable-task: 1.30
      workflow-durable-task-step:2:32
    • 1.19

      The following pipeline code fails if the connection to the agent is lost and reconnected during execution of the shell script:

      node {
          sshagent(['someCredential']) {
              sh '''
      for i in `seq 1 120`;
      do
          sleep 1
          echo $i
      done
      '''
              echo "Done"
          }
          echo "Done"
      }
      

      The build log is as follows:

      Started by user Jon
      Running in Durability level: MAX_SURVIVABILITY
      [Pipeline] Start of Pipeline
      [Pipeline] node
      Running on build-srv123 in /some/path/workspace/job/path
      [Pipeline] {
      [Pipeline] sshagent
      [ssh-agent] Using credentials someCredential
      [ssh-agent] Looking for ssh-agent implementation...
      [ssh-agent]   Exec ssh-agent (binary ssh-agent on a remote machine)
      $ ssh-agent
      SSH_AUTH_SOCK=/tmp/ssh-gDb94Onc5Loa/agent.30438
      SSH_AGENT_PID=30440
      Running ssh-add (command line suppressed)
      Identity added: /some/path/workspace/job/path-tmp/private_key_970164985924018321.key (/some/path/workspace/job/path-tmp/private_key_970164985924018321.key)
      [ssh-agent] Started.
      [Pipeline] {
      [Pipeline] sh
      + seq 1 120
      + sleep 1
      + echo 1
      1
      + sleep 1
      + echo 2
      2
      + sleep 1
      + echo 3
      3
      + sleep 1
      + echo 4
      4
      + sleep 1
      + echo 5
      5
      + sleep 1
      Cannot contact build-srv123: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on build-srv123 failed. The channel is closing down or has closed down
      + echo 6
      6
      + sleep 1
      + echo 7
      7
      + sleep 1
      + echo 8
      8
      ... truncated for your reading sanity...
      + sleep 1
      + echo 119
      119
      + sleep 1
      + echo 120
      120
      [Pipeline] echo
      Done
      [Pipeline] }
      [Pipeline] // sshagent
      [Pipeline] }
      [Pipeline] // node
      [Pipeline] End of Pipeline
      [BFA] Scanning build for known causes...
      [BFA] No failure causes found
      [BFA] Done. 0s
      java.io.EOFException
      	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
      	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
      	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
      	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
      	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
      	at hudson.remoting.Command.readFrom(Command.java:140)
      	at hudson.remoting.Command.readFrom(Command.java:126)
      	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
      Caused: java.io.IOException: Unexpected termination of the channel
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
      Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on build-srv123 failed. The channel is closing down or has closed down
      	at hudson.remoting.Channel.call(Channel.java:950)
      	at hudson.Launcher$RemoteLauncher.launch(Launcher.java:1060)
      	at hudson.Launcher$ProcStarter.start(Launcher.java:455)
      	at com.cloudbees.jenkins.plugins.sshagent.exec.ExecRemoteAgent.stop(ExecRemoteAgent.java:130)
      	at com.cloudbees.jenkins.plugins.sshagent.SSHAgentStepExecution.cleanUp(SSHAgentStepExecution.java:198)
      	at com.cloudbees.jenkins.plugins.sshagent.SSHAgentStepExecution.access$000(SSHAgentStepExecution.java:24)
      	at com.cloudbees.jenkins.plugins.sshagent.SSHAgentStepExecution$Callback.finished(SSHAgentStepExecution.java:108)
      	at org.jenkinsci.plugins.workflow.steps.BodyExecutionCallback$TailCall.onSuccess(BodyExecutionCallback.java:118)
      	at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution$SuccessAdapter.receive(CpsBodyExecution.java:377)
      	at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:87)
      	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
      	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
      	at sun.reflect.GeneratedMethodAccessor1513.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
      	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
      	at com.cloudbees.groovy.cps.Next.step(Next.java:83)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
      	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:186)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:370)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:93)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:282)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:270)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:66)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Finished: FAILURE
      

      As can be seen in the log, the connection fails, but once the agent comes back online the build log is fetched and continuously pulled until the shell script finishes on the agent. Once the shell script is done SSHAgent throws exception in its teardown phase.

      This behavior is different from when the master restarts during ongoing build, in that situation SSHAgent step works as expected. Once the agent comes back online ssh-agent plugin runs necessary commands to restores operation:

      ...
      + sleep 1
      + echo 17
      17
      + sleep 1
      Resuming build at Fri Sep 06 08:27:37 CEST 2019 after Jenkins restart
      Waiting to resume part of some ยป job #15: Finished waiting
      Ready to run at Fri Sep 06 08:27:40 CEST 2019
      [ssh-agent] Using credentials someCredential
      [ssh-agent] Looking for ssh-agent implementation...
      [ssh-agent]   Exec ssh-agent (binary ssh-agent on a remote machine)
      $ ssh-agent
      SSH_AUTH_SOCK=/tmp/ssh-wNzzYEFltayP/agent.10155
      SSH_AGENT_PID=10157
      Running ssh-add (command line suppressed)
      Identity added: /some/path/workspace/job/path-tmp/private_key_155210530577449396.key (/some/path/workspace/job/path-tmp/private_key_155210530577449396.key)
      [ssh-agent] Started.
      + echo 18
      18
      + sleep 1
      + echo 19
      19
      + sleep 1
      + echo 20
      ....
      

            jons Jon Sten
            jons Jon Sten
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated:
              Resolved: