Jenkins / JENKINS-49707

Auto retry for elastic agents after channel closure


      While my pipeline was running, the node that was executing it terminated. I see this at the bottom of my console output:

      Cannot contact ip-172-31-242-8.us-west-2.compute.internal: java.io.IOException: remote file operation failed: /ebs/jenkins/workspace/common-pipelines-nodeploy at hudson.remoting.Channel@48503f20:ip-172-31-242-8.us-west-2.compute.internal: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on ip-172-31-242-8.us-west-2.compute.internal failed. The channel is closing down or has closed down

      There's a spinning arrow below it.

      I have a cron script that uses the Jenkins master CLI to remove nodes that have stopped responding. When I look at this node's page in the Jenkins web UI, it still appears to be running that job, and I see an orange label that says "Feb 22, 2018 5:16:02 PM Node is being removed".

      Is there a better way to say "If the channel closes down, retry the work on another node with the same label"?

      Things seem stuck. Please advise.
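
      To make the requested behavior concrete, here is a minimal sketch of the retry logic in plain Python. It is not Jenkins code and uses no Jenkins API: `ChannelClosedError`, `run_with_agent_retry`, the agent names, and the `(name, labels)` data shape are all hypothetical stand-ins chosen for illustration. The idea is only that a channel closure moves the work to the next agent carrying the same label, while any other failure is reported as usual.

```python
class ChannelClosedError(Exception):
    """Hypothetical stand-in for hudson.remoting.ChannelClosedException."""


def run_with_agent_retry(task, agents, label, max_attempts=3):
    """Run `task` on agents carrying `label`, moving on when a channel closes.

    `agents` is a list of (name, labels) pairs and `task` is a callable that
    takes an agent name. Both are illustrative assumptions, not Jenkins APIs.
    """
    candidates = [name for name, labels in agents if label in labels]
    failed = []
    for name in candidates[:max_attempts]:
        try:
            return name, task(name)
        except ChannelClosedError:
            # The agent went away mid-run; try the next one with the same label.
            failed.append(name)
    raise RuntimeError(
        f"no agent with label {label!r} completed the task; tried {failed}"
    )


def flaky_build(agent_name):
    # Simulate the first agent terminating while the build is running.
    if agent_name == "agent-a":
        raise ChannelClosedError("channel is closing down or has closed down")
    return "SUCCESS"


agents = [
    ("agent-a", {"linux"}),
    ("agent-b", {"linux"}),
    ("agent-c", {"windows"}),
]
result = run_with_agent_retry(flaky_build, agents, "linux")
# result == ("agent-b", "SUCCESS")
```

      Only the channel-closure error triggers a retry in this sketch. An ordinary build failure propagates unchanged, so a genuinely broken build would not be rerun on every agent with that label.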

        1. grubSystemInformation.html
          67 kB
          Federico Naum
        2. image-2018-02-22-17-27-31-541.png
          56 kB
          Jon B
        3. image-2018-02-22-17-28-03-053.png
          30 kB
          Jon B
        4. JavaMelodyGrubHeapDump_4_07_18.pdf
          220 kB
          Federico Naum
        5. JavaMelodyNodeGrubThreads_4_07_18.pdf
          9 kB
          Federico Naum
        6. jenkins_Agent_devbuild9_System_Information.html
          66 kB
          Federico Naum
        7. jenkins_agents_Thread_dump.html
          172 kB
          Federico Naum
        8. jenkins.log
          984 kB
          Federico Naum
        9. jobConsoleOutput.txt
          12 kB
          Federico Naum
        10. jobConsoleOutput.txt
          12 kB
          Federico Naum
        11. MonitoringJavaelodyOnNodes.html
          44 kB
          Federico Naum
        12. NetworkAndMachineStats.png
          224 kB
          Federico Naum
        13. threadDump.txt
          98 kB
          Amir Barkal
        14. Thread dump [Jenkins].html
          219 kB
          Federico Naum

            jglick Jesse Glick
            piratejohnny Jon B