  1. Jenkins
  2. JENKINS-28759

Batch steps on slaves randomly hang when complete

Details

    Description

      Batch steps that succeed are hanging, and this has become more frequent since the upgrade to Jenkins 1.6.16 + WF 1.7. I think this is recent: I do not recall encountering such issues with Jenkins 1.6.09 + WF 1.5. This is highly problematic for workflow scripts that rely on large numbers of batch steps. Note that the slave nodes in question may be considered "high-latency", with response times occasionally measured in seconds.

      Reproduced 2 out of 4 times using the following test idiom. Increasing the loop count below to 1000 will probably make it reproduce 100% of the time, and anecdotally, parallelizing seems to increase the reproduction rate:

      node('slave') {
        for (int i = 0; i < 100; ++i) {
          echo "i=${i}"
          // Placeholder: substitute some batch step that takes variable time to run, e.g. scm or make
          bat '<some batch step>'
        }
      }
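
      Since parallelizing is reported above to raise the reproduction rate, a parallel variant of the same idiom might look like the sketch below. It is an illustration only, not a confirmed reproduction: the branch count of 4, the branch names, and the `ping` command standing in for a variable-duration batch step are all assumptions.

      ```groovy
      // Hypothetical parallel variant of the test idiom; not taken from the original report.
      def branches = [:]
      for (int b = 0; b < 4; ++b) {        // 4 branches is an arbitrary choice
        def id = b                         // copy the loop variable so each closure sees its own value
        branches['branch-' + id] = {
          node('slave') {
            for (int i = 0; i < 100; ++i) {
              echo "branch=${id} i=${i}"
              // Stand-in for a batch step of variable duration (assumption): pings localhost 2 to 4 times
              bat "ping -n ${(i % 3) + 2} 127.0.0.1 > NUL"
            }
          }
        }
      }
      parallel branches
      ```

      Each branch allocates its own executor on a node labelled 'slave', so enough executors must be available for the branches to actually run concurrently.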
      

          Activity

            gijskuijer Gijs Kuijer added a comment -

            This one is probably related to this issue: https://issues.jenkins-ci.org/browse/JENKINS-34150

            gijskuijer Gijs Kuijer added a comment - - edited

            https://issues.jenkins-ci.org/browse/JENKINS-34150 probably resolves this issue as well.

            jglick Jesse Glick added a comment -

            Any known way to reproduce from scratch? There are a lot of related issues which are all probably duplicates, but it is unclear what the trigger conditions are.

            mcrooney mcrooney added a comment - - edited

            I wonder whether this is specific to batch steps or is a general Pipeline issue: we rarely but regularly see Pipeline hang indefinitely after a shell step has finished, at:

            + exit 0
            

            They always have to be hard-killed:

            + exit 0
            Aborted by Example User
            Click here to forcibly terminate running steps
            Terminating stage
            Click here to forcibly kill entire build
            Hard kill!
            Finished: ABORTED
            

            Would this be a different bug?

            drazul Daniel Aguado Araujo added a comment - - edited

            Workaround: run those steps with PowerShell.

            I have been affected by this bug for a few days, since making some changes to my VM builders. I use the Swarm client.
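
            A minimal sketch of that workaround, assuming the Pipeline `powershell` step is available on the installation and that the work is wrapped in a script named `build.bat` (a hypothetical name used only for illustration):

            ```groovy
            // Sketch only: swaps the bat step for the powershell step in the loop from the description.
            node('slave') {
              for (int i = 0; i < 100; ++i) {
                echo "i=${i}"
                // build.bat is a hypothetical script; PowerShell's call operator (&) invokes it
                powershell '& .\\build.bat'
              }
            }
            ```

            Whether this avoids the hang in every setup is not established in this thread.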


            People

              Unassigned Unassigned
              sumdumgai A C
              Votes: 19
              Watchers: 29
