• Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Component: core
    • Environment: Jenkins 2.150
      Windows 7
      cygwin

      After the shell script executes successfully, the workers remain stuck for 10 minutes on the final "exit 0".

      There hasn't been any other failure that I could find: all the jobs run exactly as planned, they just don't seem to exit.

      The fact that the jobs remain stuck for exactly 600 seconds makes me think of a timeout of some sort.

      Reverting to 2.138 fixed the issue, which is why I am marking it as a regression.

          [JENKINS-55106] Build stuck on final "exit 0"

          Guru Vamsi Chintala added a comment - - edited

          We are seeing the same issue in regular builds and pull requests: the build gets stuck on exit 0 for more than 5 minutes and only then reports the status.

          Sean Kline added a comment -

          We are seeing this issue as well on version 2.150.1 running on Windows Server 2012 R2. Builds that took 4 minutes prior to the upgrade were taking 18 minutes afterward. We have reverted to version 2.138.3, which resolved the issue.

          If there's information that I can provide to help pin this down, please let me know.

          Guru Vamsi Chintala added a comment -

          Reverting the Jenkins version to 2.138.3 fixed the issue. I hope it is fixed in the next Jenkins LTS version.

          Thank you, Sean.

          Anatoly Shirokov added a comment - - edited

          Confirmed. The same issue occurs with 2.150.1 on Windows Server 2003, JDK 8. As you can see, there is a delay of exactly 10 minutes before the finish:

           

           19:59:42 D:\Jenkins\jobs\product\workspace>echo done 
           19:59:42 done
           19:59:42 
           19:59:42 D:\Jenkins\jobs\product\workspace>exit 0 
           20:09:44 Finished: SUCCESS
          

          We have reverted to the 2.138.2 version.
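To measure the pause in a log like the one above, a small script can scan the timestamps and report the longest gap between consecutive lines. This is only a sketch: it assumes every relevant line starts with an `HH:MM:SS` timestamp, as in the excerpt, and the helper name `largest_gap` is made up for illustration.

```python
from datetime import datetime, timedelta


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest pause."""
    stamped = []
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if not parts:
            continue
        try:
            stamp = datetime.strptime(parts[0], "%H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading HH:MM:SS timestamp
        stamped.append((stamp, line.strip()))

    best = (0.0, None, None)
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        delta = t1 - t0
        if delta < timedelta(0):
            delta += timedelta(days=1)  # assumption: the log crossed midnight once
        if delta.total_seconds() > best[0]:
            best = (delta.total_seconds(), before, after)
    return best


# Console output copied from the comment above.
log = [
    r"19:59:42 D:\Jenkins\jobs\product\workspace>echo done",
    r"19:59:42 done",
    r"19:59:42",
    r"19:59:42 D:\Jenkins\jobs\product\workspace>exit 0",
    r"20:09:44 Finished: SUCCESS",
]

gap, before, after = largest_gap(log)
print(f"{gap:.0f} s between {before!r} and {after!r}")
```

For this excerpt the gap is 602 seconds, which matches the roughly 600-second stall described in the report.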

           

           

          Shawn Baker added a comment -

          I can also confirm this on Windows Server 2012 R2, JDK 8, on Jenkins 2.150.2. After the Build portion of the configuration has completed, there is a 10-minute delay before the Post-build Actions begin.

          I reverted back to 2.138.4.

          Michele Ippolito added a comment -

          I tried upgrading my instance to 2.164 and I can still reproduce it. I'll revert again to 2.138 for the moment.

          I'll lose access to this instance soon (~3 weeks) so if anyone needs me to try stuff, now's the time.

          Guru Vamsi Chintala added a comment -

          I added "-DSoftKillWaitSeconds=0" to jenkins.xml before the -jar option. Jobs now execute normally with version 2.150.2.
          Reference: https://stackoverflow.com/questions/54039226/jenkins-hangs-between-build-and-post-build/54072987#54072987
                           https://issues.jenkins-ci.org/browse/JENKINS-55422?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel
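The position of the flag matters: the java launcher treats everything after the JAR path as arguments to the program, so a -D option placed after -jar never becomes a system property. The sketch below checks an argument string for that mistake. The function name and both sample argument strings are hypothetical and are not copied from any jenkins.xml in this thread.

```python
import shlex


def jvm_property_is_effective(arguments, name):
    """Return True if -D<name>[=value] appears before -jar in the argument string."""
    # posix=False keeps backslashes and quotes intact, which suits Windows paths.
    for token in shlex.split(arguments, posix=False):
        if token == "-jar":
            return False  # reached the JAR first; later -D options are program arguments
        if token == f"-D{name}" or token.startswith(f"-D{name}="):
            return True
    return False


# Hypothetical argument strings, for illustration only.
before_jar = r'-Xrs -Xmx256m -DSoftKillWaitSeconds=0 -jar "%BASE%\jenkins.war" --httpPort=8080'
after_jar = r'-Xrs -Xmx256m -jar "%BASE%\jenkins.war" -DSoftKillWaitSeconds=0 --httpPort=8080'

print(jvm_property_is_effective(before_jar, "SoftKillWaitSeconds"))
print(jvm_property_is_effective(after_jar, "SoftKillWaitSeconds"))
```

A check like this only inspects the configured string. Whether the running JVM actually picked up the property still has to be confirmed on the instance itself.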

          Sean Kline added a comment -

          Thank you very much for pointing this out, Guru. We tried this and are now running 2.150.2 without the delay.

          Have a great day!

          kredens added a comment -

          Still doesn't work properly, had to roll back from 2.164.1 LTS to 2.138.4 LTS. Workaround mentioned above doesn't work for me either, builds are getting stuck.

          Daniel Beck added a comment -

          "Workaround mentioned above doesn't work for me either"

          Are you sure you applied it correctly? Check /systemInfo to see whether the system property is defined in Jenkins.

          kredens added a comment -

          danielbeck yes, it's applied properly and still various builds are randomly getting stuck. 

          Daniel Beck added a comment -

          kredens While you're waiting for the build to finish, check what Jenkins is doing: https://wiki.jenkins.io/display/JENKINS/Obtaining+a+thread+dump

          Josh Schreuder added a comment -

          Does this work for Jenkins slaves? This fixed our master instance, but it seems like a similar issue is present with jobs that run on slaves.

          Adding `-DSoftKillWaitSeconds=0` to jenkins-slave.xml and restarting the service adds it to the command line, but it doesn't seem to have any effect. Any ideas?

          kredens added a comment -

          I should add I also run the jobs on slaves, not on the master node.

          Rob Anderson added a comment -

          We also experience this, but only on slave machines, and adding -DSoftKillWaitSeconds=0 has no effect on slave nodes.

          Ken Lamb added a comment - - edited

          We also experience this issue with jobs that run on slave machines. Adding -DSoftKillWaitSeconds did not affect the issue.

          Tried rolling back to Jenkins version 2.150.3, but the bug was still there.

          Then, rolled back to Jenkins version 2.138.4, and the bug is now gone.

          We will have to stay on 2.138.4 until this bug is resolved.

          Daniel Beck added a comment -

          To clarify, are you setting the system property on agent processes? I.e. as additional launch arguments to java -jar agent.jar?

          Ken Lamb added a comment -

          I only set it on the master launch process. From what I have read, it has no effect on slaves.

          Daniel Beck added a comment -

          Right, Josh wrote that. Would still like explicit confirmation from someone affected that setting it doesn't work, including confirmation that it appears correctly on the URL /computer/name_here/systemInfo in the list of system properties, since it's easy to get the Java invocation wrong.

          Josh Schreuder added a comment -

          danielbeck

          Here's the slave command line:

          And from jenkins-slave.xml

          I'm pretty confident that this invocation is correct, as it's copied from our master agent where this parameter is working fine.

          Shawn Baker added a comment - - edited

          Any progress with this? I understand that there is a workaround, but shouldn't the commit that broke it be looked at to at least see why it's broken?

          Philipp Mascha added a comment -

          Can you please fix this?

          Flemming Steffensen added a comment -

          We've had this problem for 6 months or more, and have been searching high and low for a solution, without finding this issue.

          Just applied the workaround on one of our agents, and immediately cut down the build-time of one of our jobs by 25 minutes!!!!!!!!!!

          I can't wait to see how much server time will be freed by this, but it looks like a LOT!

          Andy Lin added a comment -

          This issue is preventing me from upgrading my Jenkins, and the plugin-to-Jenkins version gap is getting harder and harder to deal with.

          Is this issue going to be looked at? And has anyone had success with a workaround for a Jenkins instance that uses only slave machines?

          Ken Lamb added a comment -

          Just wanted to add a "me too" to Andy Lin's comment.
          I used to be very diligent about keeping my Jenkins and all the plugins up to date.
          This bug, however, has everything stuck with what works using Jenkins 2.138.4.

          John Rocha added a comment -

          loafloaf, what scenario are you encountering this under? I had a similar problem when using MS Visual Studio on a slave.

          In my case the problem is that the slave waits for remote processes to close, and has a timeout of ~2 minutes per process. I found that I had parallel compiles enabled and 6 remote VS compile sessions on the slave. When the build finished, those VS processes would not go away, and every 2 minutes Jenkins would kill one of them.

          I learned that MS causes compile processes to stick around once they are started. The idea being that when a new compile is needed it can grab one of the idle processes. However, in my case Jenkins doesn't need/want any more compiles and is stuck waiting for the VS processes to go away.

          There is a flag that can be used at the command line that tells VS not to keep the processes alive. Details can be found in a similar issue I logged, JENKINS-59400.
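As a rough illustration of the arithmetic in this comment, the sketch below lists when each lingering process would be killed if exactly one is killed per timeout interval. That sequential model is an assumption taken from the description above, not from the Jenkins source, and `kill_schedule` is a made-up helper name.

```python
def kill_schedule(process_count, timeout_minutes):
    """Minutes after the build step ends at which each lingering process is killed.

    Assumes one process is killed per timeout interval, one after another.
    """
    return [timeout_minutes * (i + 1) for i in range(process_count)]


# Figures from the comment: 6 lingering VS processes, about 2 minutes each.
schedule = kill_schedule(6, 2)
total_delay = schedule[-1] if schedule else 0
print(schedule, total_delay)
```

Under that assumption, six lingering processes add up to about 12 minutes of waiting after the build itself has finished.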

          Andy Lin added a comment -

          rocha_stratovan, I do use MS Visual Studio on some of my slave machines, but I don't think it does parallel compilation. I'll be sure to try out what you suggested. Do you experience the issue if you don't do parallel compiles?

          I have Mac slave machines for the other half. The solution you had might apply in some way so I'll have to investigate if xcodebuild also does something similar with lingering processes. Thanks!

          John Rocha added a comment -

          loafloaf, I didn't seem to notice it when I did simple compiles without parallel compilation. Although I honestly would expect there to be at least a 2 minute delay even if there is just one compile process. But I don't know.

          Good luck.

          kredens added a comment -

          Well, time passes, and some jobs still get stuck on the FINAL stage for about two minutes before finally letting go. How hard can it be to fix this?

          Daniel Beck added a comment -

          kredens If it's so easy, submit a PR that does it.

          kredens added a comment -

          danielbeck I'm not a developer on this project, neither am I using it by choice. Also - it used to work fine until someone changed something and can't be bothered to fix it.

            Assignee: Unassigned
            Reporter: Michele Ippolito (ippo343)
            Votes: 22
            Watchers: 31