hard killing a pipeline leaves the JVM CPS thread running.

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor
    • Component: workflow-cps-plugin
    • Labels: None
    • Environment: pipeline 1.13, jenkins 1.642.1

      If a pipeline build will not die, you can hard kill it; however, hard killing it leaves the JVM's CPS thread still running on the master.

      For example, with this script:

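      // Busy loop that never returns and never checks for interruption.
      def spin() {
          while (true) {}
      }

      // A single parallel branch is enough to pin the CPS VM thread.
      def map = [:]
      map["spin_it"] = { spin() }
      parallel map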
      

      You will need to hard kill the build to stop it (on Windows at least), but inspecting the JVM threads shows that the CPS thread is still running in a tight loop.
      A hard kill should probably (if it is safe and does not cause deadlocks elsewhere) forcibly kill the thread as well. Otherwise, after a while you may run out of handles or other native resources due to the thread usage, at which point you need to restart Jenkins to get it working again.
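
One way to confirm that a thread is spinning rather than merely blocked is to sample its CPU time twice. The following is a minimal sketch using the standard `ThreadMXBean` API; the class name `BusyThreads` and the method `cpuNanosOver` are made up for illustration and are not part of Jenkins or any plugin.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class BusyThreads {

    /**
     * Returns the CPU nanoseconds the given thread consumed during sampleMillis,
     * or -1 if the JVM cannot measure per-thread CPU time or the thread has died.
     */
    static long cpuNanosOver(Thread t, long sampleMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return -1;
        }
        long before = mx.getThreadCpuTime(t.getId());
        Thread.sleep(sampleMillis);
        long after = mx.getThreadCpuTime(t.getId());
        return (before < 0 || after < 0) ? -1 : after - before;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a runaway CPS VM thread: a tight loop that only stops on interrupt.
        Thread spinner = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) { }
        }, "spinner");
        spinner.setDaemon(true);
        spinner.start();

        long used = cpuNanosOver(spinner, 200);
        System.out.println("spinner CPU ns in 200 ms window: " + used);
        spinner.interrupt();
    }
}
```

A thread whose CPU time grows by roughly the length of the sampling window is busy-looping; a blocked or sleeping thread shows a delta near zero.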

          [JENKINS-32986] hard killing a pipeline leaves the JVM CPS thread running.

          James Nord added a comment -

          Not sure it can be blocking something that is fixed but here goes.


          Jesse Glick added a comment -

          Picking up some stuff from JENKINS-25623:

          • If the CPS VM is running native code, Thread.interrupt should be called. It should be given a limited grace period—say, a few seconds—to terminate; after that, resort to Thread.stop, making sure we are able to provide a fresh Thread for the pool so we can still run finally blocks or whatever.
          • We may also need some sort of per-build CPS VM CPU quota, distinct from timeout in that we do not care about wall clock time spent running a shell script on an agent; we just care about not overloading the master. Alternatively, if a given build starts taking too much CPU time (measurable via System.nanoTime around runNextChunk), gradually begin delaying its chunk execution (i.e., CpsThreadGroup.scheduleRun may call schedule rather than submit) so that it does not hog the system, and also institute a hard time limit for individual chunks (such as slow native methods).

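
The two proposals above can be sketched in plain Java. This is an illustration under assumptions, not the workflow-cps implementation: `CpsVmSketch`, `interruptWithGrace`, and `ChunkThrottle` are hypothetical names, and the quota and delay values are arbitrary. The sketch stops at reporting that the grace period expired; it does not call `Thread.stop`, which is deprecated and throws `UnsupportedOperationException` on recent JDKs (20 and later).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CpsVmSketch {

    /** Interrupts the worker and waits up to graceMillis; returns true if it terminated in time. */
    static boolean interruptWithGrace(Thread worker, long graceMillis) throws InterruptedException {
        worker.interrupt();
        worker.join(graceMillis);
        return !worker.isAlive();
    }

    /** Hypothetical per-build throttle: sums chunk run time and delays later chunks once over quota. */
    static class ChunkThrottle {
        private final ScheduledExecutorService pool;
        private final long quotaNanos;
        private final long maxDelayMillis;
        private long usedNanos;

        ChunkThrottle(ScheduledExecutorService pool, long quotaNanos, long maxDelayMillis) {
            this.pool = pool;
            this.quotaNanos = quotaNanos;
            this.maxDelayMillis = maxDelayMillis;
        }

        synchronized void record(long nanos) {
            usedNanos += nanos;
        }

        /** Zero while under quota; otherwise the overshoot in milliseconds, capped at maxDelayMillis. */
        synchronized long delayMillis() {
            long overMillis = (usedNanos - quotaNanos) / 1_000_000;
            return overMillis <= 0 ? 0 : Math.min(maxDelayMillis, overMillis);
        }

        /** Runs the chunk immediately (submit) or after a delay (schedule), timing it with System.nanoTime. */
        void scheduleRun(Runnable chunk) {
            Runnable timed = () -> {
                long start = System.nanoTime();
                try {
                    chunk.run();
                } finally {
                    record(System.nanoTime() - start);
                }
            };
            long delay = delayMillis();
            if (delay == 0) {
                pool.submit(timed);
            } else {
                pool.schedule(timed, delay, TimeUnit.MILLISECONDS);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Cooperative thread: blocked in sleep, so the interrupt ends it promptly.
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // exit
            }
        });
        sleeper.start();
        System.out.println("cooperative thread stopped: " + interruptWithGrace(sleeper, 2_000));

        // Tight loop that ignores interrupts, like the spin() example in the description.
        Thread spinner = new Thread(() -> { while (true) { } });
        spinner.setDaemon(true); // let the JVM exit even though this thread never stops
        spinner.start();
        System.out.println("spinning thread stopped: " + interruptWithGrace(spinner, 500));

        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        ChunkThrottle throttle = new ChunkThrottle(pool, TimeUnit.SECONDS.toNanos(1), 5_000);
        throttle.record(TimeUnit.SECONDS.toNanos(3)); // pretend earlier chunks already used 3 s
        System.out.println("next chunk delayed by ms: " + throttle.delayMillis());
        pool.shutdown();
    }
}
```

The spinning thread illustrates why interruption alone cannot fix the reported problem: a loop that never blocks and never checks the interrupt flag simply keeps running after the grace period.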

          Jesse Glick added a comment -

          Ran across a situation where a build started but did not print any output other than its causes and had to be hard-killed. Turned out its CPS VM thread was consuming 100% CPU indefinitely:

          	at org.jboss.marshalling.reflect.UnlockedHashMap.doPut(UnlockedHashMap.java:201)
          	at org.jboss.marshalling.reflect.UnlockedHashMap.putIfAbsent(UnlockedHashMap.java:300)
          	at org.jboss.marshalling.reflect.SerializableClassRegistry.lookup(SerializableClassRegistry.java:73)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:177)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          	at org.jboss.marshalling.river.BlockMarshaller.doWriteObject(BlockMarshaller.java:65)
          	at org.jboss.marshalling.river.BlockMarshaller.writeObject(BlockMarshaller.java:56)
          	at org.jboss.marshalling.MarshallerObjectOutputStream.writeObjectOverride(MarshallerObjectOutputStream.java:50)
          	at org.jboss.marshalling.river.RiverObjectOutputStream.writeObjectOverride(RiverObjectOutputStream.java:179)
          	at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:344)
          	at java.util.HashMap.internalWriteEntries(HashMap.java:1777)
          	at java.util.HashMap.writeObject(HashMap.java:1354)
          	at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.jboss.marshalling.reflect.SerializableClass.callWriteObject(SerializableClass.java:271)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:976)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
          	at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
          	at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
          	at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
          	at org.jenkinsci.plugins.workflow.support.pickles.serialization.RiverWriter.writeObject(RiverWriter.java:132)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgram(CpsThreadGroup.java:465)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgram(CpsThreadGroup.java:444)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:394)
          	at …
          

          Looking at the code

                      while (threshold < Integer.MAX_VALUE && newSize > threshold) {
                          if (sizeUpdater.compareAndSet(table, newSize, newSize | 0x80000000)) { // ← HERE
                              resize(table);
                              return nonexistent();
                          }
                      }
          

          I am guessing we hit an infinite loop somehow. Seems to be JBMAR-189 which I guess will go into 1.4.12.Final.


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          pom.xml
          http://jenkins-ci.org/commit/workflow-support-plugin/e05ab1249db1fa63bd5dcfcbd55c689cb63af36e
          Log:
          JENKINS-32986 Noting need for https://github.com/jboss-remoting/jboss-marshalling/pull/48.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          src/main/java/org/jenkinsci/plugins/workflow/support/concurrent/Timeout.java
          src/test/java/org/jenkinsci/plugins/workflow/support/concurrent/TimeoutTest.java
          http://jenkins-ci.org/commit/workflow-support-plugin/c810b3874134f60be670d1205b6673fde5003c14
          Log:
          JENKINS-32986 Introducing a general-purpose Timeout utility.
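
For readers unfamiliar with the idea, a general-purpose timeout of this kind can be built by scheduling an interrupt of the calling thread and cancelling it when the guarded block finishes. The sketch below is an assumption-laden illustration, not the plugin's `Timeout` class: the name `InterruptAfter` and its `after` method are invented here, and the real utility may differ in API and behavior.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical try-with-resources guard that interrupts the calling thread after a time limit. */
public final class InterruptAfter implements AutoCloseable {

    // Single daemon timer thread shared by all guards.
    private static final ScheduledExecutorService TIMER =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "interrupt-after-timer");
                t.setDaemon(true);
                return t;
            });

    private final ScheduledFuture<?> task;

    private InterruptAfter(ScheduledFuture<?> task) {
        this.task = task;
    }

    /** Schedules an interrupt of the current thread once the limit elapses. */
    public static InterruptAfter after(long time, TimeUnit unit) {
        Thread caller = Thread.currentThread();
        return new InterruptAfter(TIMER.schedule(caller::interrupt, time, unit));
    }

    /** Cancels the pending interrupt. If it already fired, the interrupt flag may still be set. */
    @Override
    public void close() {
        task.cancel(false);
    }

    public static void main(String[] args) {
        try (InterruptAfter guard = InterruptAfter.after(100, TimeUnit.MILLISECONDS)) {
            Thread.sleep(5_000); // stands in for a hang-prone operation
            System.out.println("finished in time");
        } catch (InterruptedException e) {
            System.out.println("timed out");
        }
    }
}
```

Like any interrupt-based mechanism, this only helps with operations that respond to interruption (blocking I/O, sleeps, lock waits); it cannot stop a pure busy loop such as the one in the issue description.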

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          src/main/java/org/jenkinsci/plugins/workflow/support/concurrent/Timeout.java
          src/test/java/org/jenkinsci/plugins/workflow/support/concurrent/TimeoutTest.java
          http://jenkins-ci.org/commit/workflow-support-plugin/957d76a5538747f85db6f9ae33f076ee435f534b
          Log:
          Merge pull request #29 from jglick/Timeout-JENKINS-32986

          JENKINS-32986 Introducing a general-purpose Timeout utility

          Compare: https://github.com/jenkinsci/workflow-support-plugin/compare/eb031e04d6e3...957d76a55387

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          pom.xml
          src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodyExecution.java
          src/main/java/org/jenkinsci/plugins/workflow/cps/CpsThread.java
          http://jenkins-ci.org/commit/workflow-cps-plugin/c0deed0a3b546ebcb59ea25681ed3ac8b13fe6bb
          Log:
          JENKINS-32986 Apply a timeout to some hang-prone operations in the CPS VM thread.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          pom.xml
          src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodyExecution.java
          src/main/java/org/jenkinsci/plugins/workflow/cps/CpsThread.java
          http://jenkins-ci.org/commit/workflow-cps-plugin/51c02d40783bdc2be4e825d29c4c28286aa8c1dc
          Log:
          Merge pull request #102 from jglick/Timeout-JENKINS-32986

          JENKINS-32986 Apply a timeout to some hang-prone operations in the CPS VM thread

          Compare: https://github.com/jenkinsci/workflow-cps-plugin/compare/8da4ed31126f...51c02d40783b

          Jesse Glick added a comment -

          workflow-support PR 37 should fix the SerializableClassRegistry issue.


            Assignee: jglick Jesse Glick
            Reporter: teilo James Nord
            Votes: 4
            Watchers: 14