On our instances, pipeline jobs very often get stuck after a restart. The pipeline seems to detect that it needs to continue and tries to resume execution where it was interrupted, but nothing is actually executed. It only looks like it is running in the UI: the computer is occupied by the job and the build is in the running state, but it appears to be in an endless waiting cycle.

      Interestingly, this state is probably saved persistently: I do not hit it every time, but once it occurs, I can restart Jenkins as many times as I want and the build never recovers (it always gets stuck).

      I believe the problem is somewhere in program.dat, but it is not as easily human-readable as XML, so I am not sure where the difference is.

      I did the same with the previous 2 runs and they were able to recover, but the third one did not.

      I am attaching screenshots of the successful recoveries (one of them ended in failure, but it did not get stuck, which counts as success in this case) and a screenshot of the build with the issue. Since the state seems to be persistent (as described above), I archived the Jenkins home and am attaching it too, along with the Jenkins war. The archived Jenkins home also contains the pipeline versions.

       

      Thank you for your help!

       

          [JENKINS-50892] Pipeline jobs stuck after restart

          Lucie Votypkova created issue -
          Lucie Votypkova made changes -
          Attachment New: stuck-screen.png [ 42223 ]
          Lucie Votypkova made changes -
          Attachment New: Screenshot from 2018-04-19 11-16-19.png [ 42224 ]
          Lucie Votypkova made changes -
          Attachment New: Screenshot from 2018-04-19 11-19-22.png [ 42225 ]
          Lucie Votypkova made changes -
          Priority Original: Minor [ 4 ] New: Major [ 3 ]

          Lucie Votypkova added a comment -

          I am sorry, I cannot upload the war and the home because of the size limit. I added only the "jobs home"; I will send you the rest by e-mail if you are interested.
          Lucie Votypkova made changes -
          Attachment New: jobs.zip [ 42226 ]
          Sam Van Oort made changes -
          Assignee New: Sam Van Oort [ svanoort ]

          Jesse Glick added a comment -

          Maybe related to JENKINS-50199 svanoort? Just a casual guess, have not looked at the details.


          Sam Van Oort added a comment -

          lvotypkova Have you tried adding this startup setting on your Jenkins master? "-Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=300"

          Off the top of my head, this one looks like it's an issue with the Shell step and how it detects whether the shell step is running or dead, rather than a resume issue. Adding a long-enough timeout with that setting may resolve this.
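
          For readers unfamiliar with "-D" startup settings: the option above is an ordinary JVM system property, so it has to be on the java command line that launches Jenkins (for example `java -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=300 -jar jenkins.war`; adapt this to however your instance is started). The sketch below only illustrates how such a property becomes visible inside the JVM. The class name `HeartbeatSetting` and the helper `heartbeatInterval` are hypothetical names made up for illustration; this is not the durable-task plugin's actual code.

```java
public class HeartbeatSetting {
    // Property name exactly as quoted in the comment above.
    static final String KEY =
        "org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL";

    // Hypothetical helper: returns the value passed via -D, or the given
    // fallback when the property is unset or is not a valid integer.
    static int heartbeatInterval(int fallback) {
        return Integer.getInteger(KEY, fallback);
    }

    public static void main(String[] args) {
        // -1 is an arbitrary sentinel meaning "property not set on this JVM".
        System.out.println("heartbeat check interval = " + heartbeatInterval(-1));
    }
}
```

          Running this class with the same "-D" option on the command line should print the configured value, while running it without the option prints the sentinel. The same kind of check can be done against a live instance to confirm the setting was actually picked up after the restart.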


            jglick Jesse Glick
            lvotypkova Lucie Votypkova