NPE in CPS VM thread at WorkflowRun$GraphL.onNewHead

    • Pipeline - April 2018

      I have 2 jobs stuck waiting in the build queue. They are apparently waiting for 2 other jobs to complete, but the nodes' executors are free. I don't know if these NPEs can cause this behavior, but they don't look right anyway.

      Feb 21, 2018 8:38:55 PM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution onLoad
      WARNING: Pipeline state not properly persisted, cannot resume job/ice/job/3.7/221/
      Feb 21, 2018 8:38:55 PM org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService reportProblem
      WARNING: Unexpected exception in CPS VM thread: CpsFlowExecution[Owner[ice/3.7/221:ice/3.7 #221]]
      java.lang.NullPointerException
      at org.jenkinsci.plugins.workflow.job.WorkflowRun$GraphL.onNewHead(WorkflowRun.java:997)
      at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.notifyListeners(CpsFlowExecution.java:1368)
      at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$3.run(CpsThreadGroup.java:412)
      at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35)
      at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
      at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      
      Feb 21, 2018 8:38:55 PM org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService reportProblem
      WARNING: Unexpected exception in CPS VM thread: CpsFlowExecution[Owner[ice/3.7/221:ice/3.7 #221]]
      java.lang.NullPointerException
      at org.jenkinsci.plugins.workflow.job.WorkflowRun$GraphL.onNewHead(WorkflowRun.java:997)
      at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.notifyListeners(CpsFlowExecution.java:1368)
      at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$3.run(CpsThreadGroup.java:412)
      at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35)
      at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
      at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      
      

        1. build.xml.gz
          3 kB
        2. build.xml.gz
          3 kB
        3. flowNodeStore.xml.gz
          53 kB
        4. flowNodeStore.xml.gz
          53 kB
        5. jenkins.log.gz
          257 kB
        6. jenkins.log.gz
          257 kB
        7. workflow-cps.hpi
          540 kB
        8. workflow-job.hpi
          110 kB

          [JENKINS-49686] NPE in CPS VM thread at WorkflowRun$GraphL.onNewHead

          Jesse Glick added a comment - - edited

          Just realized this was on my list of “Pipeline test flakes I have not gotten around to filing/fixing yet”:

          java.lang.NullPointerException
          	at org.jenkinsci.plugins.workflow.job.WorkflowRun$GraphL.onNewHead(WorkflowRun.java:959)
          	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.notifyListeners(CpsFlowExecution.java:1220)
          	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$3.run(CpsThreadGroup.java:408)
          	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35)
          	at …
          

          in

          synchronized (completed) {
          

          Seen randomly in some test in org.jenkinsci.plugins.workflow.support.pickles.serialization.SerializationSecurityTest which breaks resumption of a build.
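
          For readers unfamiliar with this failure mode: in Java, a synchronized statement throws NullPointerException when its lock expression evaluates to null. The following is a minimal sketch of that behavior, not the plugin's code. The class NullMonitorDemo, its init() method, and the idea that the field is simply not yet assigned when the listener fires are all assumptions made for illustration; only the field name completed and the synchronized (completed) statement come from the comment above.

          ```java
          import java.util.concurrent.atomic.AtomicBoolean;

          public class NullMonitorDemo {
              // Hypothetical stand-in for a lock field that is assigned late
              // (for example after a reload); it is null until init() runs.
              private AtomicBoolean completed;

              public void init() {
                  completed = new AtomicBoolean();
              }

              /** Returns true if entering the synchronized block threw an NPE. */
              public boolean onNewHeadThrows() {
                  try {
                      synchronized (completed) { // NPE here while completed == null
                          return false;
                      }
                  } catch (NullPointerException e) {
                      return true;
                  }
              }

              public static void main(String[] args) {
                  NullMonitorDemo demo = new NullMonitorDemo();
                  System.out.println(demo.onNewHeadThrows()); // listener fires before init
                  demo.init();
                  System.out.println(demo.onNewHeadThrows()); // listener fires after init
              }
          }
          ```

          The first call reports the exception because the lock object does not exist yet; the second call enters the block normally.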

          Sam Van Oort added a comment -

          jglick Won't that open up other race conditions? Adding an Action to the flowNode is safe, but this can be invoking "finish" on the Run.

          Perhaps the listener should block until completed is non-null?
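
          The blocking idea floated here could look roughly like the sketch below. It is an illustration only, not what the plugin does or what the eventual fix did: the class BlockingListenerSketch, the CountDownLatch, the timeout parameter, and the boolean return value are all assumptions. Only the field name completed is taken from the thread.

          ```java
          import java.util.concurrent.CountDownLatch;
          import java.util.concurrent.TimeUnit;
          import java.util.concurrent.atomic.AtomicBoolean;

          public class BlockingListenerSketch {
              // Hypothetical lock field, assigned by init() at some later point.
              private AtomicBoolean completed;
              // Released once completed has been assigned; countDown()/await()
              // also make the assignment visible to the waiting thread.
              private final CountDownLatch initialized = new CountDownLatch(1);

              public void init() {
                  completed = new AtomicBoolean();
                  initialized.countDown();
              }

              /**
               * Waits up to timeoutMillis for init(), then takes the lock.
               * Returns false if the wait timed out or was interrupted.
               */
              public boolean onNewHead(long timeoutMillis) {
                  try {
                      if (!initialized.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                          return false; // gave up waiting; never touched the null field
                      }
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt();
                      return false;
                  }
                  synchronized (completed) {
                      return true; // safe: completed is non-null here
                  }
              }
          }
          ```

          A bounded wait is used so the calling thread cannot hang forever if initialization never happens. Whether blocking a listener thread is acceptable at all is a separate question, which the follow-up comment also raises.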

          Sam Van Oort added a comment -

          On second thought... no, not quite, but that feels very thread-unsafe.

          Sam Van Oort added a comment -

          bentoi I've got some proposed fixes that should resolve... well, the issue here plus a whole beehive of related bugs with similar causes. They also add some catch-all protections against unseen but possible issues.

          Please could you try out the plugins below (snapshot builds of PRs in final review) and confirm that they fully resolve the issues for you:

          workflow-cps.hpi
          workflow-job.hpi

          Or if you prefer to build them yourself, the PRs are here:
          https://github.com/jenkinsci/workflow-cps-plugin/pull/216
          https://github.com/jenkinsci/workflow-job-plugin/pull/93

          Thanks!

          Code changed in jenkins
          User: Sam Van Oort
          Path:
          pom.xml
          src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowRun.java
          src/test/java/org/jenkinsci/plugins/workflow/job/WorkflowRunRestartTest.java
          src/test/java/org/jenkinsci/plugins/workflow/job/WorkflowRunTest.java
          http://jenkins-ci.org/commit/workflow-job-plugin/f0c26058f31d4f159a82a3cace52935e93f20701
          Log:
          Merge pull request #93 from svanoort/fix-resume-issues

          Fix resume issues JENKINS-49686 and JENKINS-50199 and JENKINS-50407

          Compare: https://github.com/jenkinsci/workflow-job-plugin/compare/e11cea623f61...f0c26058f31d

          Sam Van Oort added a comment -

          Resolved as of workflow-cps 2.47 and workflow-job 2.18

          Sam Van Oort added a comment -

          At least one report still in JENKINS-50199

          bentoi added a comment -

          I no longer see the NullPointerException with the latest plugins; however, I'm seeing some failures when trying to load the "FlowNodes":

          Apr 25, 2018 2:21:04 AM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution initializeStorage
          WARNING: Tried to load head FlowNodes for execution Owner[ice/3.7/264:ice/3.7 #264] but FlowNode was not found in storage for head id:FlowNodeId 1:1858
          Apr 25, 2018 2:21:04 AM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution rebuildEmptyGraph
          WARNING: Failed to load pipeline heads, so faking some up for execution CpsFlowExecution[Owner[ice/3.7/264:ice/3.7 #264]]
          Apr 25, 2018 2:21:04 AM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution onLoad
          WARNING: Completed flow without FlowEndNode: CpsFlowExecution[Owner[ice/3.7/264:ice/3.7 #264]] heads:1859::1859:org.jenkinsci.plugins.workflow.graph.FlowStartNode[id=1860]

          These failures are also mentioned on JENKINS-50199. Let me know if you need any additional information related to these.

          Sam Van Oort added a comment -

          bentoi Appreciate the update – I think I've got a handle on all of the related issues here and test coverage for it. Just working through a few remaining challenges and then review + human testing.

          Sam Van Oort added a comment -

          Should be resolved with release of workflow-cps 2.50 and workflow-job 2.21 CC bentoi

            svanoort Sam Van Oort
            bentoi bentoi