  1. Jenkins
  2. JENKINS-63242

Successful builds fail at the end (Could not initialize class hudson.slaves.SlaveComputer)

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: matrix-project-plugin
    • Labels:
      None
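    • Environment:
      Jenkins version: 2.204.6
      Matrix Project plugin version: 1.17
      Jenkins master machine: CentOS 7.7.1908, Java openjdk version 1.8.0_252
      Parent Slave machine: CentOS 7.8.2003, Java openjdk version 1.8.0_252
      Child Slave machine: CentOS 7.4.1708, Java openjdk version 1.8.0_144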

      Description

      When I run a matrix job, after all the children build successfully, the parent slave suddenly fails at the end of the build.

      This happens intermittently. A reboot always solves the problem, but after some time it happens again and we need to reboot the parent slave again.

      • It happens only on the parent slave in matrix jobs.
      • We recently raised the JVM maximum heap size to 7 GB when connecting the parent as a Jenkins slave; that didn't help either.

      This is the error it produces:
      FATAL: Remote call on parent_slave failed
      Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to parent_slave
      at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1741)
      at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
      at hudson.remoting.Channel.call(Channel.java:955)
      at hudson.Launcher$RemoteLauncher.kill(Launcher.java:1086)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:510)
      at hudson.model.Run.execute(Run.java:1853)
      at hudson.matrix.MatrixBuild.run(MatrixBuild.java:323)
      at hudson.model.ResourceController.execute(ResourceController.java:97)
      at hudson.model.Executor.run(Executor.java:427)
      java.lang.NoClassDefFoundError: Could not initialize class hudson.slaves.SlaveComputer
      at hudson.util.ProcessTree.get(ProcessTree.java:432)
      at hudson.Launcher$RemoteLauncher$KillTask.call(Launcher.java:1103)
      at hudson.Launcher$RemoteLauncher$KillTask.call(Launcher.java:1094)
      at hudson.remoting.UserRequest.perform(UserRequest.java:211)
      at hudson.remoting.UserRequest.perform(UserRequest.java:54)
      at hudson.remoting.Request$2.run(Request.java:369)
      at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      Caused: java.io.IOException: Remote call on parent_slave failed
      at hudson.remoting.Channel.call(Channel.java:961)
      at hudson.Launcher$RemoteLauncher.kill(Launcher.java:1086)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:510)
      at hudson.model.Run.execute(Run.java:1853)
      at hudson.matrix.MatrixBuild.run(MatrixBuild.java:323)
      at hudson.model.ResourceController.execute(ResourceController.java:97)
      at hudson.model.Executor.run(Executor.java:427)
       

        Attachments

          Activity

          benmag Ben Magriso created issue -
          benmag Ben Magriso made changes -
          Field: Environment
          Original Value:
            Jenkins version: 2.204.6
            Matrix Project plugin version: 1.14
            Slave machine: CentOS 7
            Slave java: 1.8.0_252 (x86_64)
          New Value:
            Jenkins version: 2.204.6
            Matrix Project plugin version: 1.14
            Slave machine: CentOS 7
            Slave java version: 1.8.0_252 (x86_64)
          benmag Ben Magriso made changes -
          Field: Description
          Change: "this happens sometimes" capitalized to "This happens sometimes"; the remaining text and the stack trace are unchanged.
          benmag Ben Magriso made changes -
          Field: Environment
          Original Value:
            Jenkins version: 2.204.6
            Matrix Project plugin version: 1.14
            Slave machine: CentOS 7
            Slave java version: 1.8.0_252 (x86_64)
          New Value:
            Jenkins version: 2.204.6
            Matrix Project plugin version: 1.17
            Slave machine: CentOS 7
            Slave java version: 1.8.0_252 (x86_64)
          shuky_r Shuky Riechard added a comment -

          Adding info on this issue: we have used the Matrix Project plugin heavily for 3-4 years.

          Since we upgraded Jenkins to 2.204.6 / OpenJDK 1.8.0_252, we have experienced this problem on 2 different Jenkins servers.

          From time to time, after all children pass successfully, the parent build fails with this error:
          java.lang.NoClassDefFoundError: Could not initialize class hudson.slaves.SlaveComputer

          After the first failure, all subsequent builds on that slave fail as well.
          To fix it we have to reboot the parent slave, until it happens again.
          It can happen when only 1-2 parent builds are running on this slave, or when more than 10 builds are running.

          shuky_r Shuky Riechard made changes -
          Field: Environment
          Original Value:
            Jenkins version: 2.204.6
            Matrix Project plugin version: 1.17
            Slave machine: CentOS 7
            Slave java version: 1.8.0_252 (x86_64)
          New Value:
            Jenkins version: 2.204.6
            Matrix Project plugin version: 1.17
            Jenkins master machine: CentOS 7.7.1908, Java openjdk version 1.8.0_252
            Parent Slave machine: CentOS 7.8.2003, Java openjdk version 1.8.0_252
            Child Slave machine: CentOS 7.4.1708, Java openjdk version 1.8.0_144
          shuky_r Shuky Riechard made changes -
          Field: Priority
          Original Value: Minor [ 4 ]
          New Value: Major [ 3 ]
          faizan Muhammad Faizan ul haq added a comment -

          Could you try updating the JDK to 1.8.0_252 on the child slave machine?

          benmag Ben Magriso added a comment -

          Hi Muhammad Faizan ul haq,

           

          I updated the JDK to 1.8.0_252 on the child slave machine; it didn't help either.

          bladend Adam Romanek added a comment - - edited

          Hi, I came across this issue by accident, but my team and I have been facing a similar one for a long time. To me it seems that this issue is related to JENKINS-61103. I left a comment there describing my findings. A potential fix has recently been merged into the Jenkins remoting library, but it has not been published in any official Jenkins release yet.

          Also, to the best of my knowledge, the issue is usually triggered when a job is aborted on a slave, which effectively means the corresponding Java thread on Jenkins is interrupted. If that happens during Java class initialization handled by the Jenkins remoting library, the slave becomes "corrupted" and fails builds until it is disconnected and reconnected. In our case the problem was triggered by the Gerrit Trigger plugin when a new patchset for the same change was pushed. The issue could also be amplified by bugs like the one fixed in gerrit-trigger-plugin PR#361, which was effectively causing the same build to be aborted (and thus the corresponding Java thread interrupted) multiple times instead of just once.
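
The "fails until reconnected" behaviour described in this comment is consistent with how the JVM treats a class whose static initializer has thrown: the first access raises ExceptionInInitializerError, and every later access in the same JVM raises NoClassDefFoundError with the message "Could not initialize class ...". The following is a minimal sketch of that general JVM behaviour only. The class names InitFailureDemo and Fragile are hypothetical, the Thread.sleep call is an assumed stand-in for any interruptible work done during initialization, and nothing here is Jenkins or remoting code.

```java
public class InitFailureDemo {

    // Hypothetical stand-in for a class that does interruptible work
    // inside its static initializer.
    static class Fragile {
        static final String VALUE;
        static {
            try {
                // Thread.sleep throws InterruptedException right away
                // if the current thread's interrupt flag is already set.
                Thread.sleep(10);
            } catch (InterruptedException e) {
                throw new IllegalStateException("interrupted during class initialization", e);
            }
            VALUE = "ok";
        }
    }

    /** Touches Fragile and describes whatever Throwable that produces. */
    static String touch() {
        try {
            return Fragile.VALUE;
        } catch (Throwable t) {
            return t.getClass().getSimpleName() + ": " + t.getMessage();
        }
    }

    public static void main(String[] args) {
        // Simulate an aborted build: the thread is interrupted
        // just before the class is initialized for the first time.
        Thread.currentThread().interrupt();
        System.out.println(touch()); // first access: ExceptionInInitializerError

        // Clear the interrupt flag; the class still cannot be used,
        // because the JVM does not retry a failed initialization.
        Thread.interrupted();
        System.out.println(touch()); // later access: NoClassDefFoundError
    }
}
```

Because the JVM never re-runs a failed initializer, the class stays unusable for the lifetime of that JVM (more precisely, of the class loader that defined it). That would explain why only restarting or reconnecting the agent clears the error, and why raising the heap size has no effect.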


            People

            Assignee:
            kohsuke Kohsuke Kawaguchi
            Reporter:
            benmag Ben Magriso
            Votes:
            2
            Watchers:
            4

              Dates

              Created:
              Updated: