JENKINS-49707

Auto retry for elastic agents after channel closure


      While my pipeline was running, the node that was executing logic terminated. I see this at the bottom of my console output:

      Cannot contact ip-172-31-242-8.us-west-2.compute.internal: java.io.IOException: remote file operation failed: /ebs/jenkins/workspace/common-pipelines-nodeploy at hudson.remoting.Channel@48503f20:ip-172-31-242-8.us-west-2.compute.internal: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on ip-172-31-242-8.us-west-2.compute.internal failed. The channel is closing down or has closed down
      

      There's a spinning arrow below it.

      I have a cron script that uses the Jenkins master CLI to remove nodes that have stopped responding. When I look at this node's page in the Jenkins web UI, the node appears to still be running that job, and I see an orange label that says "Feb 22, 2018 5:16:02 PM Node is being removed".

      I'm wondering what would be a better way to say "If the channel closes down, retry the work on another node with the same label"?

      Things seem stuck. Please advise.
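
      To make the requested behavior concrete, here is a minimal Python sketch of "retry on another node with the same label". It is not Jenkins code and does not use any Jenkins API: `ChannelClosedError`, `run_with_agent_retry`, the `AGENTS` inventory, and the `build` task are all hypothetical names made up for illustration. The only assumption is that a lost agent surfaces as a distinct error type, so that other failures are not retried.

```python
class ChannelClosedError(Exception):
    """Hypothetical stand-in for hudson.remoting.ChannelClosedException."""


def run_with_agent_retry(task, agents, label, max_attempts=2):
    """Run task on agents carrying `label`, moving to the next one if the channel closes."""
    candidates = [a for a in agents if label in a["labels"]]
    failures = []
    for agent in candidates[:max_attempts]:
        try:
            return task(agent["name"])
        except ChannelClosedError as exc:
            # Only a lost agent triggers a retry; any other error propagates.
            failures.append((agent["name"], str(exc)))
    raise RuntimeError(f"no agent labelled {label!r} finished the task: {failures}")


# Hypothetical inventory: two agents share the label, and the first one has died.
AGENTS = [
    {"name": "agent-a", "labels": {"linux"}},
    {"name": "agent-b", "labels": {"linux"}},
    {"name": "agent-c", "labels": {"windows"}},
]


def build(agent_name):
    """Hypothetical unit of work that fails on the terminated agent."""
    if agent_name == "agent-a":
        raise ChannelClosedError("channel is closing down or has closed down")
    return f"built on {agent_name}"


result = run_with_agent_retry(build, AGENTS, "linux")
print(result)
```

      The point of the sketch is that the retry is scoped to the agent-loss condition and is bounded by `max_attempts`, so a genuine build failure would still fail the job instead of looping.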

        1. image-2018-02-22-17-27-31-541.png
          56 kB
        2. image-2018-02-22-17-28-03-053.png
          30 kB
        3. jenkins_agent_devbuild9_remoting_logs.zip
          4 kB
        4. jenkins_Agent_devbuild9_System_Information.html
          66 kB
        5. jenkins_agents_Thread_dump.html
          172 kB
        6. jenkins_support_2018-06-29_01.14.18.zip
          1.26 MB
        7. jobConsoleOutput.txt
          12 kB
        8. JavaMelodyNodeGrubThreads_4_07_18.pdf
          9 kB
        9. MonitoringJavaelodyOnNodes.html
          44 kB
        10. grub.remoting.logs.zip
          3 kB
        11. jobConsoleOutput.txt
          12 kB
        12. JavaMelodyGrubHeapDump_4_07_18.pdf
          220 kB
        13. NetworkAndMachineStats.png
          224 kB
        14. Thread dump [Jenkins].html
          219 kB
        15. grubSystemInformation.html
          67 kB
        16. slaveLogInMaster.grub.zip
          8 kB
        17. jenkins.log
          984 kB
        18. support_2018-07-04_07.35.22.zip
          956 kB
        19. threadDump.txt
          98 kB

            jglick Jesse Glick
            piratejohnny Jon B
            Votes: 37
            Watchers: 54
