Jenkins / JENKINS-7707

Multiple dead executors on slaves post 1.379 upgrade

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component: remoting
    • Labels: None
    • Environment: CentOS Linux 5.x kernel 2.6.18-194.3.1.el5
      hudson.war 1.379 under Tomcat 5.5.28
      Slave OSs: CentOS Linux 5.x, Windows XP 32bit, Windows Server 2008 64bit

    Description

      Since upgrading to 1.379 we have been seeing dead executors on our slave systems. Before this release we had never encountered a dead executor on any system, master or slave. Immediately after deploying the 1.379 WAR, 6 executors spread across a variety of slave platforms (Linux, WinXP 32bit, Win2k8 64bit) died. Today one more died on a Linux slave. Restarting Hudson clears out the dead executors, but disconnecting and reconnecting the slaves does not. I have not yet tried rebooting the slaves themselves. The stack trace below is the output consistently associated with the dead executors.

      java.lang.AbstractMethodError
      at hudson.model.Executor.getEstimatedRemainingTimeMillis(Executor.java:340)
      at hudson.model.queue.LoadPredictor$CurrentlyRunningTasks.predict(LoadPredictor.java:77)
      at hudson.model.queue.MappingWorksheet.<init>(MappingWorksheet.java:303)
      at hudson.model.Queue.pop(Queue.java:753)
      at hudson.model.Executor.grabJob(Executor.java:175)
      at hudson.model.Executor.run(Executor.java:113)
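
A note on the trace above: java.lang.AbstractMethodError is thrown when code calls a method that the receiving class does not actually implement, which usually means that class was compiled against an older version of an interface or abstract class. Because it is an Error rather than an Exception, it passes straight through any `catch (Exception e)` and, if nothing else catches it, ends the thread it was thrown on. The sketch below only illustrates that mechanism and one defensive pattern. It is not Hudson's code or the fix that was shipped; `Task`, `CurrentTask`, `LegacyTask` and `safeEstimate` are hypothetical names, and `LegacyTask` throws the error explicitly to stand in for a class built against an older API.

```java
class EstimateGuard {
    /** Hypothetical API whose method was added in a newer core release. */
    interface Task {
        long estimatedDurationMillis();
    }

    /** An implementation compiled against the current API. */
    static class CurrentTask implements Task {
        public long estimatedDurationMillis() {
            return 5_000L;
        }
    }

    /**
     * Stand-in for a class compiled against the older API. A real stale
     * binary would make the JVM raise AbstractMethodError at the call site;
     * here the error is thrown by hand so the example fits in one file.
     */
    static class LegacyTask implements Task {
        public long estimatedDurationMillis() {
            throw new AbstractMethodError();
        }
    }

    /** Returns the estimate, or -1 when the implementation predates the method. */
    static long safeEstimate(Task task) {
        try {
            return task.estimatedDurationMillis();
        } catch (AbstractMethodError e) {
            // Treat "method missing in this binary" as "no estimate available"
            // instead of letting the Error end the calling thread.
            return -1L;
        }
    }

    public static void main(String[] args) {
        Task[] tasks = { new CurrentTask(), new LegacyTask() };
        for (Task t : tasks) {
            System.out.println(t.getClass().getSimpleName() + " -> " + safeEstimate(t));
        }
    }
}
```

Catching the Error at the single call site keeps the failure local to the one stale implementation. Whether a fallback value is appropriate depends on how the caller uses the estimate.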

          Activity

            carlo_bonamico carlo_bonamico added a comment -

            I just noticed that at the time the issue appeared, I had both upgraded to 1.384 AND set the maximum number of SCM polling threads to 20. Removing the polling thread limit apparently made the issue disappear. The issue also seemed to occur just after SCM polling for a big project had taken place. I have about 40 projects on the server, and 4 slaves.

            mindless Alan Harder added a comment -

            The original reporter has mentioned not seeing this issue anymore. Does anyone else still see dead slaves with this exception on the latest Hudson release?

            java.lang.AbstractMethodError
            at hudson.model.Executor.getEstimatedRemainingTimeMillis(Executor.java:340)
            usammmy usammmy added a comment -

            Upgraded to 1.385. We haven't seen this issue for a while.


            carlo_bonamico carlo_bonamico added a comment -

            I am not seeing it on 1.385 and the latest Batch Task plugin.

            mindless Alan Harder added a comment -

            OK, thanks. Closing this out. Reopen if anyone sees this AbstractMethodError on a recent release.


            People

              Assignee: Unassigned
              Reporter: dru_n
              Votes: 6
              Watchers: 10
