• Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component/s: core, remoting
    • Environment: Jenkins 1.558 / CentOS 5 slaves

      I observe that jobs which start while the slave is running a cleanup task (the slave log shows hudson.plugins.tmpcleaner.TmpCleanTask) hang:

      Build Log:

      17:04:21 Started by upstream project "pipeline-test-2" build number 422
      17:04:21 originally caused by:
      17:04:21  Started by upstream project "master-test" build number 867
      17:04:21  originally caused by:
      17:04:21   Started by upstream project "master-build" build number 5131
      17:04:21   originally caused by:
      17:04:21    Started by an SCM change
      17:04:21 [EnvInject] - Loading node environment variables.
      ... hang for 14h ...
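
The build log stops at the EnvInject step, and the stack trace captured after the interrupt shows the executor thread parked in `Object.wait` inside `hudson.remoting.Request.call`. The difference between an unbounded wait and a bounded one can be sketched with plain JDK classes. This is not the remoting API: `BoundedWaitSketch` and `awaitReply` are hypothetical names, and the never-completed future merely stands in for a remote call whose response does not arrive. Whether core or the plugin should bound this wait is not something the report establishes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitSketch {

    /**
     * Waits for a reply for at most timeoutMillis.
     * Returns the reply, or a short message if the timeout elapses first.
     */
    static String awaitReply(CompletableFuture<String> reply, long timeoutMillis) {
        try {
            // Bounded wait: gives up after timeoutMillis instead of blocking forever.
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timed out after " + timeoutMillis + " ms";
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("call failed", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a remote call whose response never arrives.
        CompletableFuture<String> neverAnswered = new CompletableFuture<>();
        System.out.println(awaitReply(neverAnswered, 200));

        // Stand-in for a remote call that has already answered.
        CompletableFuture<String> answered = CompletableFuture.completedFuture("env loaded");
        System.out.println(awaitReply(answered, 200));
    }
}
```

With an unbounded `get()` the first call would block until the thread is interrupted, which matches the behaviour seen in the build log.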
      

      Slave Log:

      ... lots of output ...
      INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-j2se.jar (atime=1408400947, diff=-5424)
      Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
      INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/javassist.jar (atime=1408400947, diff=-5424)
      Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
      INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-system-jmx.jar (atime=1408400947, diff=-5424)
      Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
      INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-system.jar (atime=1408400947, diff=-5424)
      Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
      INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-mdr.jar (atime=1408400947, diff=-5424)
      Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
      INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-logging-spi.jar (atime=1408400947, diff=-5424)
      Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
      

      After interrupting the job, I get this:

      10:30:26 ERROR: SEVERE ERROR occurs
      10:30:26 org.jenkinsci.lib.envinject.EnvInjectException: java.lang.InterruptedException
      10:30:26 	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:77)
      10:30:26 	at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
      10:30:26 	at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
      10:30:26 	at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:575)
      10:30:26 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:481)
      10:30:26 	at hudson.model.Run.execute(Run.java:1689)
      10:30:26 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      10:30:26 	at hudson.model.ResourceController.execute(ResourceController.java:88)
      10:30:26 	at hudson.model.Executor.run(Executor.java:231)
      10:30:26 Caused by: java.lang.InterruptedException
      10:30:26 	at java.lang.Object.wait(Native Method)
      10:30:26 	at hudson.remoting.Request.call(Request.java:146)
      10:30:26 	at hudson.remoting.Channel.call(Channel.java:722)
      10:30:26 	at hudson.FilePath.act(FilePath.java:1003)
      10:30:26 	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
      10:30:26 	... 8 more
      10:30:26 Archiving artifacts
      10:30:26 ERROR: Publisher hudson.tasks.Mailer aborted due to exception
      10:30:26 
      hudson.remoting.ChannelClosedException: channel is already closed
      10:30:26 	at hudson.remoting.Channel.send(Channel.java:524)
      10:30:26 	at hudson.remoting.Request.call(Request.java:129)
      10:30:26 	at hudson.remoting.Channel.call(Channel.java:722)
      10:30:26 	at hudson.EnvVars.getRemote(EnvVars.java:404)
      10:30:26 	at hudson.model.Computer.getEnvironment(Computer.java:911)
      10:30:26 	at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:29)
      10:30:26 	at hudson.model.Run.getEnvironment(Run.java:2202)
      10:30:26 	at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:873)
      10:30:26 	at hudson.tasks.Mailer.perform(Mailer.java:134)
      10:30:26 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      10:30:26 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
      10:30:26 	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:714)
      10:30:26 	at hudson.model.Build$BuildExecution.post2(Build.java:182)
      10:30:26 	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:663)
      10:30:26 	at hudson.model.Run.execute(Run.java:1714)
      10:30:26 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      10:30:26 	at hudson.model.ResourceController.execute(ResourceController.java:88)
      10:30:26 	at hudson.model.Executor.run(Executor.java:231)
      10:30:26 Caused by: java.io.IOException
      10:30:26 	at hudson.remoting.Channel.close(Channel.java:1007)
      10:30:26 	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
      10:30:26 	at hudson.remoting.PingThread.ping(PingThread.java:120)
      10:30:26 	at hudson.remoting.PingThread.run(PingThread.java:81)
      10:30:26 Caused by: java.util.concurrent.TimeoutException: Ping started on 1409011309670 hasn't completed at 1409011549671
      10:30:26 	... 2 more
      10:30:26 [BFA] Scanning build for known causes...
      10:30:26 
      10:30:26 [BFA] Done. 0s
      10:30:26 [EnvInject] - [ERROR] - SEVERE ERROR occurs: channel is already closed
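
The `TimeoutException` at the bottom of the trace carries two epoch-millisecond values. A small sketch using only `java.time` converts them to UTC instants and computes the gap between them. The class and method names are made up for illustration. The report does not state the server's time zone, so relating these instants to the 17:04:21 build start is left open.

```java
import java.time.Duration;
import java.time.Instant;

public class PingTimeoutGap {

    /** Returns the time elapsed between two epoch-millisecond timestamps. */
    static Duration gap(long startedMillis, long observedMillis) {
        return Duration.between(
                Instant.ofEpochMilli(startedMillis),
                Instant.ofEpochMilli(observedMillis));
    }

    public static void main(String[] args) {
        // Values copied from the TimeoutException message in the trace.
        long started = 1409011309670L;
        long observed = 1409011549671L;

        System.out.println("ping started (UTC):  " + Instant.ofEpochMilli(started));
        System.out.println("still pending (UTC): " + Instant.ofEpochMilli(observed));
        System.out.println("gap in ms:           " + gap(started, observed).toMillis());
    }
}
```

The two values are 240,001 ms apart, so the ping had been outstanding for about four minutes when the channel was closed.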
      

          [JENKINS-24449] slave cleanup causes jobs to hang

          Christian Goetze created issue -
          Christian Goetze made changes -
          Description: replaced the slave host name in the log paths with a redacted placeholder (corp.XXX.com); the text is otherwise identical to the description above
          Christian Goetze made changes -
          Component/s New: core [ 15593 ]
          Component/s Original: slave-squatter [ 16076 ]
          Daniel Beck made changes -
          Labels New: remoting
          R. Tyler Croy made changes -
          Workflow Original: JNJira [ 157430 ] New: JNJira + In-Review [ 179578 ]
          Oleg Nenashev made changes -
          Component/s New: remoting [ 15489 ]
          Oleg Nenashev made changes -
          Assignee New: Oleg Nenashev [ oleg_nenashev ]
          CloudBees Inc. made changes -
          Remote Link New: This issue links to "CloudBees Internal OSS-1356 (Web Link)" [ 18730 ]
          Oleg Nenashev made changes -
          Assignee Original: Oleg Nenashev [ oleg_nenashev ]

            Assignee: Unassigned
            Reporter: Christian Goetze
            Votes: 0
            Watchers: 7