  1. Jenkins
  2. JENKINS-17757

IllegalStateException: Timer already cancelled from NodesCollector.scheduleCollectNow


Details

    Description

      A user reported that his (SSH) slaves repeatedly failed to launch with the following error:

      ...
      Copied classworlds.jar 
      Evacuated stdout 
      ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins. 
      java.lang.IllegalStateException: Timer already cancelled. 
      at java.util.Timer.sched(Timer.java:397) 
      at java.util.Timer.schedule(Timer.java:193) 
      at net.bull.javamelody.NodesCollector.scheduleCollectNow(NodesCollector.java:110) 
      at org.jvnet.hudson.plugins.monitoring.NodesListener.onOnline(NodesListener.java:51) 
      at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:472) 
      at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:340) 
      at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:678) 
      at hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:472) 
      at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:223) 
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) 
      at java.util.concurrent.FutureTask.run(FutureTask.java:166) 
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) 
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) 
      at java.lang.Thread.run(Thread.java:722) 
      […] [SSH] Connection closed. 
      ERROR: Connection terminated 
      java.io.IOException: Unexpected termination of the channel 
      at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50) 
      Caused by: ...
      

      After disabling the Monitoring plugin the problem went away.

      The root cause is unclear, but catching the IllegalStateException from scheduleCollectNow and reporting it gracefully, with diagnostics, seems like a good idea.
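
      The failure mode in the trace can be reproduced with java.util.Timer alone: once cancel() has been called, any later schedule() call throws IllegalStateException. The following is a minimal sketch of catching that exception and reporting it instead of letting it propagate. The class and method names (CancelledTimerDemo, scheduleNow) are hypothetical and do not reproduce the actual NodesCollector code.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical demo class; not part of JavaMelody or Jenkins.
class CancelledTimerDemo {

    /**
     * Tries to schedule the given work for immediate execution.
     * Returns false, after reporting the problem, if the timer was already cancelled.
     */
    static boolean scheduleNow(Timer timer, Runnable work) {
        try {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    work.run();
                }
            }, 0);
            return true;
        } catch (IllegalStateException e) {
            // Timer.schedule throws this once cancel() has been called on the timer.
            System.err.println("Could not schedule collection: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        Timer timer = new Timer("collector", true); // daemon timer thread
        timer.cancel();                             // simulate a timer that was already shut down
        boolean scheduled = scheduleNow(timer, () -> { });
        System.out.println("scheduled = " + scheduled);
    }
}
```

      With this shape, the caller receives a boolean rather than an exception, so a slave launch would not be aborted by a monitoring failure.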

      SlaveComputer.setChannel should probably also trap exceptions thrown by the listeners it calls. hudson.Util may need a convenience method that invokes a listener method (passed as a Runnable, which would suit future lambdas), catches any RuntimeException or LinkageError, and reports the error politely so that the caller can continue. It could even blacklist the listener for future calls so the log does not fill up, and list the name and version of the plugin that owns the listener class.
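
      The convenience method proposed above might look roughly like the sketch below. Everything in it is an assumption made for illustration: ListenerGuard and call are hypothetical names, no such method is claimed to exist in hudson.Util, and the lookup of the owning plugin's name and version is left out.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; not an existing Jenkins API.
class ListenerGuard {
    private static final Logger LOGGER = Logger.getLogger(ListenerGuard.class.getName());

    // Listeners that have already failed once and are skipped on later calls.
    private final Set<Object> blacklisted = ConcurrentHashMap.newKeySet();

    /**
     * Runs one listener callback. Returns true if it completed normally,
     * false if the listener was skipped or threw.
     */
    boolean call(Object listener, Runnable callback) {
        if (blacklisted.contains(listener)) {
            return false; // already reported once; avoid filling the log
        }
        try {
            callback.run();
            return true;
        } catch (RuntimeException | LinkageError e) {
            blacklisted.add(listener);
            LOGGER.log(Level.WARNING,
                    "Listener " + listener.getClass().getName()
                            + " failed and will be skipped from now on", e);
            return false;
        }
    }

    public static void main(String[] args) {
        ListenerGuard guard = new ListenerGuard();
        Object broken = new Object();
        Object healthy = new Object();
        // The failing listener is logged and blacklisted; the caller keeps going.
        guard.call(broken, () -> { throw new IllegalStateException("Timer already cancelled."); });
        guard.call(healthy, () -> System.out.println("healthy listener notified"));
    }
}
```

      Catching only RuntimeException and LinkageError, as the description suggests, leaves other Error subclasses such as OutOfMemoryError to propagate to the caller.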


          Activity

            jglick Jesse Glick created issue -
            jglick Jesse Glick made changes -
            Field Original Value New Value
            Description The log timestamp [04/17/13 22:18:02] on the "[SSH] Connection closed." line was replaced with […]; the text is otherwise identical to the description above.
            evernat evernat added a comment -

            Adding myself as an observer.

            evernat evernat added a comment -

            After looking into this further, I still do not understand why a slave would come online while Jenkins appears to be shutting down.
            In any case, to let the slave start, I have added a catch at org.jvnet.hudson.plugins.monitoring.NodesListener.onOnline(NodesListener.java:51).

            The fix will be in the next release (1.46).

            evernat evernat made changes -
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Resolved [ 5 ]

            Code changed in jenkins
            User: evernat
            Path:
            src/main/java/org/jvnet/hudson/plugins/monitoring/NodesListener.java
            http://jenkins-ci.org/commit/monitoring-plugin/cb66c467903706df63ca07af21fc1fb5f50e1e66
            Log:
            [FIX JENKINS-17757] IllegalStateException: Timer already cancelled from NodesCollector.scheduleCollectNow


            Code changed in jenkins
            User: evernat
            Path:
            src/main/java/org/jvnet/hudson/plugins/monitoring/NodesListener.java
            http://jenkins-ci.org/commit/monitoring-plugin/c60ffbdc94a632610e76a79b238683f74edadfa9
            Log:
            [FIX JENKINS-17757] IllegalStateException: Timer already cancelled from NodesCollector.scheduleCollectNow

            jglick Jesse Glick made changes -
            Link This issue is related to JENKINS-21224 [ JENKINS-21224 ]
            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 148937 ] JNJira + In-Review [ 192951 ]

            People

              Assignee: Unassigned
              Reporter: Jesse Glick
              Votes: 0
              Watchers: 4
