Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-69954

Pipeline: intermittent 'Error initializing storage and loading nodes' message occurs in clusters, and runs end up hanging

XMLWordPrintable

      We have been seeing intermittent failures in our Jenkins logfile when pipeline jobs start. The failures occur a few times a day, and typically occur in clusters – for example, we will see 3 different runs all fail with the same set of messages within a second. This might imply there's a need-for-locking issue.

      From the logfile messages, we see an indication (right when the job starts) that a build record was already found, and this then ends up messing up the run.

      The run will then hang, spinning in the console and repeatedly logging "trying to open a build log on somejob #570 after it has completed", even after the underlying K8s node has gone away. I presume this is a separate, sad-path error handling issue of some sort.

      We see the following sequence of messages logged, right when the run is started. The same set of messages are logged every time this occurs. As I mentioned before, we see these messages logged in clumps – maybe one or two times a day, we will see 3 runs all fail with this set of messages logged, and all messages are logged within a second.

      [2022-10-26 02:59:37][WARNING][org.jenkinsci.plugins.workflow.cps.CpsFlowExecution onLoad] Error initializing storage and loading nodes, will try to create placeholders for: CpsFlowExecution[Owner[somejob/570:si.somejob #570]]
      java.io.IOException: Tried to load head FlowNodes for execution Owner[somejob/570:si.somejob #570] but FlowNode was not found in storage for head id:FlowNodeId 1:6
              at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.initializeStorage(CpsFlowExecution.java:679)
              at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onLoad(CpsFlowExecution.java:716)
              at org.jenkinsci.plugins.workflow.job.WorkflowRun.getExecution(WorkflowRun.java:701)
              at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:560)
              at hudson.model.RunMap.retrieve(RunMap.java:233)
              at hudson.model.RunMap.retrieve(RunMap.java:61)
              at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:650)
              at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:632)
              at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:530)
              at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:230)
              at org.jenkinsci.plugins.workflow.job.WorkflowJob.getBuildByNumber(WorkflowJob.java:244)
              at org.jenkinsci.plugins.workflow.job.WorkflowJob.getBuildByNumber(WorkflowJob.java:105)
              at hudson.model.Run.fromExternalizableId(Run.java:2497)
              at org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper.getRawBuild(RunWrapper.java:84)
              at org.jenkinsci.plugins.pipeline.modeldefinition.model.BuildCondition.meetsCondition(BuildCondition.java:75)
              at jdk.internal.reflect.GeneratedMethodAccessor12358.invoke(Unknown Source)
              at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.base/java.lang.reflect.Method.invoke(Method.java:566)
              at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
              at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
              at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:41)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:144)
              at org.jenkinsci.plugins.pipeline.modeldefinition.model.AbstractBuildConditionResponder$_satisfiedConditions_closure1.doCall(AbstractBuildConditionResponder.groovy:58)
              at jdk.internal.reflect.GeneratedMethodAccessor15234.invoke(Unknown Source)
              at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.base/java.lang.reflect.Method.invoke(Method.java:566)
              at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
              at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
              at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:264)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
              at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:41)
              at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
              at org.codehaus.groovy.runtime.callsite.BooleanReturningMethodInvoker.invoke(BooleanReturningMethodInvoker.java:51)
              at org.codehaus.groovy.runtime.callsite.BooleanClosureWrapper.call(BooleanClosureWrapper.java:53)
              at org.codehaus.groovy.runtime.DefaultGroovyMethods.any(DefaultGroovyMethods.java:2455)
              at com.cloudbees.groovy.cps.CpsDefaultGroovyMethods.any(CpsDefaultGroovyMethods.java:80)
              at jdk.internal.reflect.GeneratedMethodAccessor2273.invoke(Unknown Source)
              at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.base/java.lang.reflect.Method.invoke(Method.java:566)
              at org.codehaus.groovy.runtime.metaclass.ReflectionMetaMethod.invoke(ReflectionMetaMethod.java:54)
              at org.codehaus.groovy.runtime.metaclass.NewInstanceMetaMethod.invoke(NewInstanceMetaMethod.java:56)
              at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
              at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:46)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:128)
              at org.jenkinsci.plugins.pipeline.modeldefinition.model.AbstractBuildConditionResponder.satisfiedConditions(AbstractBuildConditionResponder.groovy:56)
              at jdk.internal.reflect.GeneratedMethodAccessor15233.invoke(Unknown Source)
              at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.base/java.lang.reflect.Method.invoke(Method.java:566)
              at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
              at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
              at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:41)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:144)
              at org.jenkinsci.plugins.pipeline.modeldefinition.model.Root.hasSatisfiedConditions(Root.groovy:104)
              at jdk.internal.reflect.GeneratedMethodAccessor1502.invoke(Unknown Source)
              at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.base/java.lang.reflect.Method.invoke(Method.java:566)
              at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
              at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
              at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
              at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:41)
              at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
              at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
              at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:86)
              at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
              at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
              at jdk.internal.reflect.GeneratedMethodAccessor982.invoke(Unknown Source)
              at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.base/java.lang.reflect.Method.invoke(Method.java:566)
              at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
              at com.cloudbees.groovy.cps.impl.LocalVariableBlock$LocalVariable.get(LocalVariableBlock.java:39)
              at com.cloudbees.groovy.cps.LValueBlock$GetAdapter.receive(LValueBlock.java:30)
              at com.cloudbees.groovy.cps.impl.LocalVariableBlock.evalLValue(LocalVariableBlock.java:28)
              at com.cloudbees.groovy.cps.LValueBlock$BlockImpl.eval(LValueBlock.java:55)
              at com.cloudbees.groovy.cps.LValueBlock.eval(LValueBlock.java:16)
              at com.cloudbees.groovy.cps.Next.step(Next.java:83)
              at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
              at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
              at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
              at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
              at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
              at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
              at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
              at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:187)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:420)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:95)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:330)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:294)
              at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
              at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
              at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:30)
              at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:70)
              at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
              at java.base/java.lang.Thread.run(Thread.java:834)
      
      ...
      
      [2022-10-26 02:59:37][INFO  ][org.jenkinsci.plugins.workflow.cps.CpsFlowExecution createPlaceholderNodes] Creating placeholder flownodes for execution: CpsFlowExecution[Owner[somejob/570:somejob #570]]
      
      ...
      
      [2022-10-26 02:59:37][WARNING][org.jenkinsci.plugins.workflow.job.WorkflowRun getExecution] Found incomplete build with completed execution - display name: somejob #570
      
      ...
      
      [2022-10-26 02:59:37][INFO  ][org.jenkinsci.plugins.workflow.job.WorkflowRun finish] somejob #570 completed: ABORTED
      
      ...
      
      [2022-10-26 02:59:37][WARNING][org.jenkinsci.plugins.workflow.job.WorkflowRun getListener] null
      java.lang.IllegalStateException: trying to open a build log on somejob #570 after it has completed
              at org.jenkinsci.plugins.workflow.job.WorkflowRun.getListener(WorkflowRun.java:231)
              at org.jenkinsci.plugins.workflow.job.WorkflowRun.access$300(WorkflowRun.java:138)
              at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.getListener(WorkflowRun.java:1005)
              at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:43)
              at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:187)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:420)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:95)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:330)
              at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:294)
              at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
              at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
              at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:30)
              at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:70)
              at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
              at java.base/java.lang.Thread.run(Thread.java:834)
      
      ...
      
      ### This 'java.lang.IllegalStateException: trying to open a build log on somejob #570 after it has completed' repeats over-and-over.
      
      

            Unassigned Unassigned
            srothco Steve Roth
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated: