JENKINS-11936

Slave hangs for a long time when a job completes


      Description

      I noticed that on several jobs running on different Windows slaves, it takes a long time for the job to end, as seen in the following:

      13:53:35 D:\Jenkins\workspace\CES7.0R_InstallKit>exit 0
      14:09:36 [WARNINGS] Parsing warnings in console log with parsers [MSBuild]
      14:09:36 [WARNINGS] MSBuild : Found 24 warnings.

      I saw another issue regarding the archiving of artifacts #7641, but in our case, this specific job does not have artifacts enabled.

      Also, I noticed this on different jobs, some are builds and some are tests.

      Anything I can do to help debug this?
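      One way to narrow this down is to scan the console log for large jumps between consecutive timestamps, like the 16-minute jump in the excerpt above. The following is a minimal sketch, not part of Jenkins: it assumes every relevant line starts with an `HH:MM:SS` timestamp, and the function name `find_gaps` and the 5-minute threshold are hypothetical choices.

```python
from datetime import datetime, timedelta


def find_gaps(lines, threshold=timedelta(minutes=5)):
    """Return (gap, previous_line, next_line) for each pair of consecutive
    timestamped lines that are at least `threshold` apart."""
    gaps = []
    prev_time = None
    prev_line = None
    for raw in lines:
        line = raw.strip()
        try:
            # Assumption: the timestamp is the first 8 characters (HH:MM:SS).
            current = datetime.strptime(line[:8], "%H:%M:%S")
        except ValueError:
            continue  # no leading timestamp on this line, skip it
        if prev_time is not None:
            delta = current - prev_time
            if delta < timedelta(0):
                # Assumption: a negative difference means the log crossed midnight.
                delta += timedelta(days=1)
            if delta >= threshold:
                gaps.append((delta, prev_line, line))
        prev_time, prev_line = current, line
    return gaps


# Log lines copied from the report above.
log = [
    r"13:53:35 D:\Jenkins\workspace\CES7.0R_InstallKit>exit 0",
    "14:09:36 [WARNINGS] Parsing warnings in console log with parsers [MSBuild]",
    "14:09:36 [WARNINGS] MSBuild : Found 24 warnings.",
]

for gap, before, after in find_gaps(log):
    print(f"{gap} between:\n  {before}\n  {after}")
```

      The line that follows each reported gap shows which build step ran first after the delay, which may hint at the step that was waiting.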

      ---------------------------------

      I stopped a build that was hung and I got this call stack:

      10:48:26 f:\Jenkins\workspace\CES7.0Dx86_WIN>exit 9009
      10:48:27 Build step 'Execute Windows batch command' marked build as failure
      11:07:55 ERROR: Publisher hudson.plugins.warnings.WarningsPublisher aborted due to exception
      11:07:55 java.lang.InterruptedException
      11:07:55 at java.lang.Object.wait(Native Method)
      11:07:55 at java.lang.Object.wait(Object.java:485)
      11:07:55 at hudson.model.Run$Runner$CheckpointSet.waitForCheckPoint(Run.java:1295)
      11:07:55 at hudson.model.Run.waitForCheckpoint(Run.java:1263)
      11:07:55 at hudson.model.CheckPoint.block(CheckPoint.java:144)
      11:07:55 at hudson.tasks.BuildStepMonitor$2.perform(BuildStepMonitor.java:25)
      11:07:55 at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:692)
      11:07:55 at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:667)
      11:07:55 at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:645)
      11:07:55 at hudson.model.Build$RunnerImpl.post2(Build.java:162)
      11:07:55 at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:614)
      11:07:55 at hudson.model.Run.run(Run.java:1429)
      11:07:55 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      11:07:55 at hudson.model.ResourceController.execute(ResourceController.java:88)
      11:07:55 at hudson.model.Executor.run(Executor.java:238)

      Looks like the culprit is the WarningsPublisher plugin. I'll disable it for now.

          Activity

          danielbeck Daniel Beck added a comment -

          Alex: Read the original issue report. It points to the Warnings Plugin in both the build log and the stack trace. Tom Clift experienced a different problem (with a version of Disk Usage Plugin that didn't yet exist when this issue was filed!) that is actually tracked as JENKINS-23347.

          rodriguesalex Alex Rodrigues added a comment - - edited

          I'm sorry, Daniel, but I do not understand your last statement.

          Why do you feel that the later versions of Disk Usage Plugin are not an issue, given that uninstalling or downgrading the plugin removes the hang/delay? Isn't the fact that I am seeing the same issue evernat saw a year ago enough confirmation that this issue still exists?

          Anyway, I had earlier reported that this issue occurs on one remote node but not the other. The remote node where this occurs has much less free space (i.e. <40%) than the other node (>70%), where this works fine. It appears that the performance of the plugin may be affected by the disk usage on the remote node, so one might see differing results depending on the node the job is run on.
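
          To compare the two nodes on the same basis, the free-space percentage can be measured with a short script run on each node. This is only a sketch using Python's standard `shutil.disk_usage`; the helper name `free_percent` is hypothetical, and the path `"."` is a placeholder for the node's actual workspace directory.

```python
import shutil


def free_percent(path):
    """Return the free space on the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


# "." is a placeholder; point this at the workspace directory on each node.
print(f"{free_percent('.'):.1f}% free")
```

          A difference in free space between the nodes would be consistent with the observation above, but it would not by itself show that free space causes the delay.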

          danielbeck Daniel Beck added a comment -

          What's mentioned in recent comments is completely unrelated to the reported issue: Disk Usage plugin started to get the workspace size at the end of a build in recent versions. There's an option to turn that off, but according to Andrew Bayer in his talk at JUC Berlin, it doesn't work.

          Resolving this issue as there's actually not been confirmation that this issue still exists as asked by evernat over a year ago.


          If you experience a problem that looks like this and are using a recent version of Disk Usage Plugin, it's a different issue. Disable or downgrade that plugin.
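
          To get a rough idea of how long computing a workspace's size can take on a given node, one can time a plain directory walk there. This is an illustrative sketch only and is not the Disk Usage Plugin's code; the function name `workspace_size` is hypothetical and `"."` stands in for the real workspace path.

```python
import os
import time


def workspace_size(root):
    """Sum the sizes in bytes of all files under `root`, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


start = time.monotonic()
size = workspace_size(".")  # "." is a placeholder for the workspace path
elapsed = time.monotonic() - start
print(f"{size} bytes counted in {elapsed:.2f} s")
```

          If the walk itself takes minutes on the affected node, a size calculation at the end of each build is a plausible source of the delay.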

          rodriguesalex Alex Rodrigues added a comment - - edited

          I am using Jenkins v1.509.3 (with Disk Usage v0.23) and see a similar issue. I have 2 remote RHEL Linux nodes and see this issue only when I run jobs on one node but not on the other. Uninstalling the disk-usage plugin fixed my issue. I then installed v0.17 and the issue has not returned.

          tomclift Tom Clift added a comment -

          Of course, sorry, we were using "Jenkins disk-usage plugin" version 0.23, released 2013-11-12. It's the latest release on the plugin's wiki page, but there is no changelog on that page for 0.22 or 0.23.


            People

            Assignee:
            kdsweeney kdsweeney
            Reporter:
            marcsanfacon Marc Sanfacon