Start Jenkins with java -jar jenkins-2.0-beta-1.jar on Windows.

      In the console, hit Ctrl+C.

      Expected behaviour

      Jenkins does a graceful shutdown and this is appropriately logged.

      Actual behaviour

      It appears as though Jenkins is brutally terminated and does not do a graceful shutdown.
      There are no logs to indicate a graceful termination.

      In 1.x I see the following log entry:
      Mar 31, 2016 12:02:39 PM winstone.Logger logInternal INFO: JVM is terminating. Shutting down Winstone

      In 2.x I do not see that entry, and nothing in the log indicates that Jenkins is shutting down gracefully.

          [JENKINS-33926] Jenkins no longer appears to shutdown correctly

          James Nord added a comment - edited

          Whilst this sometimes occurs in 1.x testing (<10% of the time), during my 2.0 testing I get it >90% of the time, implying something is worse somewhere.

          Spike Washburn added a comment -

          Critical to investigate for RC. If this is only a problem for interactive runs, we could probably live with it for GA, but if this is affecting production instances running as services, that would be really nasty.

          Daniel Beck added a comment -

          Same issue for me on OS X so does not seem limited to Windows.

          If this means that e.g. service restarts don't do a clean restart, and e.g. lose the queue contents, that would be bad.

          If it's just a case of Jetty 9 no longer logging this, we'll survive


          Kristin Whetstone added a comment -

          While investigating, I didn't see any error show up consistently before missing the log statement. I didn't see this occur as often on 2.0 before clicking the "Install New Features". Luckily it doesn't seem to be disruptive to my Jenkins, so that's a plus. Trying to see if there's a problem after "Installing new features" or running it under heavy stress.

          Daniel Beck added a comment -

          Would be interesting to know whether the shutdown procedure (that includes saving the queue to disk) gets executed every time, log statement or not.


          Spike Washburn added a comment -

          Right, the investigation needs to prove/disprove if the shutdown sequence is consistently running as expected. Are there some clear signatures to this that Kristin could rely on? Examples: Some expected file update times change, a debug message that a Servlet.destroy got called, etc.
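One way to check the file-timestamp signature mentioned here is sketched below in Python. This is a hypothetical helper, not part of Jenkins: the name `updated_since` and the path you pass in are assumptions for illustration. It only shows that a file was touched after a given moment, not that its contents are complete.

```python
import os


def updated_since(path, since_epoch):
    """Return True if `path` exists and was modified at or after `since_epoch`.

    `since_epoch` is a Unix timestamp in seconds, e.g. a value captured with
    time.time() just before the shutdown was requested.
    """
    try:
        return os.path.getmtime(path) >= since_epoch
    except OSError:
        # A missing or unreadable file counts as "not updated".
        return False
```

A possible use: record `time.time()` immediately before stopping Jenkins, then call `updated_since` on the queue.xml file in the Jenkins home directory once the process has exited.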

          Daniel Beck added a comment -

          swashbuck1r queue.xml gets written (possibly only when there are items in the queue, not sure).


          Kristin Whetstone added a comment - edited

          I think this (JENKINS-30909) is similar since it's referring to a missing queue log.

          In testing where I physically go through the whole computer shutdown process, it appears that things run properly (I find a properly timestamped queue.xml.bak file). When I just stop the process from the command line, I don't see any queue.xml file. I assume this case falls under this defect.

          So far, the behavior I've seen echoes this work item. I'll have to look to a different mechanism (James Nord brought up TermMilestone) to see if there's another issue.

          Kristin Whetstone added a comment - edited

          After installing Jenkins 2.0 as a Windows service and running a heavy mock load, I'm still seeing the queue.xml file appear. Since the service is able to start up earlier than I can access my file system, I know this file is created because of queue.xml.bak which isn't removed & contains the same information.

          The logs left in jenkins.out.log and jenkins.err.log don't have any messages for shutting down, but I have a feeling they're overwritten on service restart, so they're likely not useful. Checking the log on the Jenkins instance also only covers the current run.

          I linked to JENKINS-30909 to highlight the behavior I'm seeing as part of restarting/destroying-recreating the service. Nothing really appears to be out of the ordinary.

          From what I can tell during these tests, I don't think this is a blocker issue for 2.0 release.


          Daniel Beck added a comment -

          "When I just stop the process from the command line, I don't see any queue.xml file. I assume this case falls under this defect."

          Would be interesting to know whether a Linux service behaves differently in shutting down. Not sure which signals each sends. Loss of the queue on every shutdown would be a regression.
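The question of which signal is sent matters because a process can only run cleanup for a termination request it is able to intercept. The Python sketch below illustrates that general idea only. It is not how Jenkins or Jetty implement shutdown, and `install_graceful_shutdown` and `save_state` are hypothetical names.

```python
import signal


def install_graceful_shutdown(save_state):
    """Run `save_state(signum)` when SIGINT (Ctrl+C) or SIGTERM arrives.

    Returns the installed handler so callers can inspect or invoke it.
    A forced kill (e.g. SIGKILL on POSIX) cannot be intercepted, so
    `save_state` would never run in that case.
    """
    def handler(signum, frame):
        save_state(signum)       # persist whatever must survive a restart
        raise SystemExit(0)      # then exit normally

    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, handler)
    return handler
```

If a service wrapper stops the process in a way that bypasses these handlers, the save step is skipped, which would look like the queue simply never being written.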


          James Nord added a comment - edited

          kwhetstone no need to restart the OS, just stop and start the service either via the SCM or sc stop jenkins so you always have access to the filesystem

          Kristin Whetstone added a comment -

          So I ran through a number of scenarios on Windows and Linux on 2.0 and 1.X and it appears that for Windows we never actually save the queue when we stop the service. Linux is fine; I started and stopped the service while every executor is used, and everything is saved and restarted. I think this might be a difference in the command signals sent to the service when stopping in Windows vs Linux, but that's just a theory at this point.

          As to other shutdown actions taken when the system is shutting down, since I'm not able to replicate the initial condition, all of my exit conditions (TermMilestones, etc) are hit. Other than this queue conditional, I don't see anything odd with the service.

          Spike Washburn added a comment -

          Summary of test results from Kristin:

          • Does the message get printed on shutdown? (Windows: 100% appears consistently, Linux: 100% appears consistently)
          • Does the shutdown sequence seem to complete?
            • Marker: queue.xml gets saved (yes on Linux, yes on Windows cmdline, no on Windows service). The "no" on Windows service seems consistent with the 1.x line of Jenkins.
            • Marker: plugin shutdown() is being called (yes on Linux and Windows)

          Decision [Daniel, Kristin]: At this time, there's no indication of a regression on beta2 vs 1.x, so this is not a stop-ship.

          Spike Washburn added a comment -

          Next steps:

          • Daniel will add a test case for this for the RC and future LTSs.
          • At this time, this issue appears to be not reproducible, so Daniel will close it.
          • Kristin will open another ticket about the problem where queue.xml doesn't get saved via Windows Service shutdown.

          Daniel Beck added a comment -

          Resolving as Cannot Reproduce for now. I'll add test cases to RC/LTS testing plans so we make sure to check this for the 2.0 RC.


          James Nord added a comment -

          "While investigating, I didn't see any error show up consistently before missing the log statement. I didn't see this occur as often on 2.0 before clicking the "Install New Features". Luckily it doesn't seem to be disruptive to my Jenkins, so that's a plus."

          This seems to imply it is reproducible from the command line?

          Anyway, I can reproduce it, so you can always assign it to me.

          Sam Van Oort added a comment -

          100% reproducible with the 2.0-rc WAR running on Mac.

          Test case:

          • Create a pipeline job

          node {
              echo 'stuffs'
              sleep 100
              stage 'Stage 2'
          }

          • Create a freestyle job, set to allow concurrent builds, with the following shell step:
            echo 'I do stuff'
            sleep 100
          • Start up several builds of the pipeline job, to consume executors.
          • Queue up the freestyle project
          • Shut down Jenkins via Ctrl+C

          Result:

          • Command line reports winstone/jetty shutting down
          • queue.xml & queue.xml.bak files are created but contain no information about the queued build or builds. Pipelines that were running will restart, however.
          • If Jenkins is started again, the build will not run from the queue. Indeed, it is as if the freestyle project was never queued: new builds will use its build #, and there is no record it ever existed.

          Sam Van Oort added a comment -

          Reopening because it exists in a painful and reproducible way (see previous comments); the entire build queue is lost on shutdown.


          Kristin Whetstone added a comment -

          James, I'm sending it over to you since I couldn't reproduce it on beta while Ctrl-C or while stopping the service. I'm going to see if I can reproduce it on the RC, but I must not be going through the correct steps or something.

          Note: the part about losing the job queue when restarted as a Windows service looks like an older problem. I definitely think that this should be a bug if it's not already.

          Kristin Whetstone added a comment -

          Sam, it sounds like your defect is something different from the original problem covered by this work item. While the queue is saved, it's not saving everything. That might be a bug in the queue.save(). In the Windows version it's not getting saved at all during Ctrl-C. We should open a ticket for that problem so it can be fixed.

          Sam Van Oort added a comment -

          Yeah, I can't tell for sure if it's the same issue, something similar, or these are all different manifestations of the same underlying cause.

          The previous issue with losing the build queue apparently had some sort of handling with nextBuildNumber that resolved it (I think?). This one seems to be some combination of shutdown logging & queue writing, which could also be two separate issues as well.


          Kristin Whetstone added a comment -

          To me, the file showing up in the first place and being read successfully on startup means the file was created correctly (no information was lost while it was being written and it's not formatted incorrectly); it might just be incomplete. Obviously I'm making some assumptions about the queue behavior, and I spent some time going through where the queue is written to figure out how it came into being. Apparently there are 2 types of shutdown: one that writes the queue and one that's "quick", not counting the error case, the secret not-so-fun shutdown. From what I understand the error case just completely tanks jetty, et al. If it's able to save state enough to pick back up on a restart, I think that's slightly better than not having anything at all.

          I'd check out if the queue being incomplete issue exists on the latest 1.x release. That will at least give some context into whether it's something new or not. Either way it's not good, but at least it's not worse!

          Sam Van Oort added a comment -

          Oh! I'd tested on CJE 1.625.16.1 - the queue is maintained when killed by Ctrl+C. Just rechecked, and that is also true on 1.642.x - so it's clearly a regression since then.

          To be clear: we're not saving state to pick back up on restart at all. The queue.xml is empty of items, this is one sample:

          <?xml version='1.0' encoding='UTF-8'?>
          <hudson.model.Queue_-State>
            <counter>42</counter>
            <items/>
          </hudson.model.Queue_-State>
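A quick way to tell an empty queue.xml like this one from a populated one is to count the children of `<items>`. The sketch below is a hypothetical Python helper: the name `count_queued_items` is an assumption, and it relies only on the element layout visible in the sample above, not on any documented Jenkins schema.

```python
import xml.etree.ElementTree as ET


def count_queued_items(queue_xml_path):
    """Return the number of child elements under <items> in a queue.xml file.

    Returns 0 when <items> is empty or absent, which matches the
    "queue lost" symptom described in this issue.
    """
    root = ET.parse(queue_xml_path).getroot()
    items = root.find("items")
    return 0 if items is None else len(items)
```

Running this against the file before and after a Ctrl+C shutdown with a build waiting in the queue would show whether the queued entries actually made it to disk.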
          


          Daniel Beck added a comment -

          Hmmm… could this be related to JENKINS-34029?


          Kristin Whetstone added a comment -

          Cool, so that's actually a regression from beta2 where that information was saved. I think that's a completely separate defect since this jetty issue happened before then. Good find during TestFest!

          James Nord added a comment -

          Not saving the queue on Windows was fixed recently on master.
          I'm unassigning from myself for the moment as I'm not looking at this. If I get to look at it and it's not taken I will re-assign it.

          Sam Van Oort added a comment -

          Closing this since Kristin opened https://issues.jenkins-ci.org/browse/JENKINS-34281 which looks to be a completely separate issue.


          Sam Van Oort added a comment -

          Closing this, since Kristin has forked this out.


            Unassigned Unassigned
            teilo James Nord