JENKINS-55287

Pipeline: Failure to load flow node: FlowNode was not found in storage for head

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Component: workflow-cps-plugin
    • Labels: None
    • Environment: Jenkins ver. 2.138.2, Pipeline: Groovy 2.61

      IMPORTANT: NOTE FROM A MAINTAINER:

      STOP! YOUR STACK TRACE ALONE IS NOT GOING TO HELP SOLVE THIS!

      (sorry to all caps but we're not going to make progress on this issue with commenters adding insufficient information)

      Note from maintainer: We'd like to be able to fix this, but we really need more information to do so. Whenever you encounter the error in the description of this ticket, please zip the build folder ($JENKINS_HOME/jobs/$PATH_TO_JOB/builds/$BUILD_NUMBER/) of the build that failed and upload it here along with the Jenkins system logs, redacting any sensitive content as necessary. Include any relevant information on the frequency of the issue, steps to reproduce (did it happen after Jenkins was restarted normally, or did Jenkins crash?), and any messages in the Jenkins system logs that seem relevant. In addition, please check service or other system-level logs for Jenkins to see if there are any issues with Jenkins taking too long to shut down or anything like that. Thanks!
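
      If it helps locate the right folder, here is a minimal script console sketch that prints the exact directory to zip (the job name and build number below are placeholders, not from this ticket):

      // Script console sketch: print the on-disk build directory to zip.
      // 'YourFolder/YourJob' and 1605 are placeholder values.
      def job = Jenkins.get().getItemByFullName('YourFolder/YourJob')
      def build = job.getBuildByNumber(1605)
      println(build.rootDir)  // e.g. $JENKINS_HOME/jobs/YourFolder/jobs/YourJob/builds/1605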

      The main thing we are currently looking for is whether these messages are present in the Jenkins logs right before Jenkins shut down for the build which has the error:

      • About to try to checkpoint the program for build CpsFlowExecution Owner[YourJobName/BuildNumber:YourJobName #BuildNumber]
      • Trying to save program before shutdown org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$8@RandomHash
      • Finished saving program before shutdown org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$8@RandomHash

      If these messages are not present, it means that Jenkins was unable to save the Pipeline, so the error is expected. If that is the case, fixing the issue probably requires changes to Jenkins packaging to configure longer service timeouts on shutdown, or totally changing how PERFORMANCE_OPTIMIZED works. If the messages are present, then something else is happening.
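
      If you want to make sure those messages are captured before the next restart, one option is to raise the log level for the Pipeline execution engine from the script console. This is a sketch, assuming the messages are logged at FINE; a log recorder created under Manage Jenkins → System Log for the same logger is a more durable alternative, since loggers configured from the script console are not persisted across restarts:

      import java.util.logging.Level
      import java.util.logging.Logger

      // Raise the level so the checkpoint/save messages listed above are emitted.
      Logger.getLogger('org.jenkinsci.plugins.workflow.cps.CpsFlowExecution')
            .setLevel(Level.FINE)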

      Exception:

      Creating placeholder flownodes because failed loading originals.
      java.io.IOException: Tried to load head FlowNodes for execution Owner[Platform Service FBI Test/1605:Platform Service FBI Test #1605] but FlowNode was not found in storage for head id:FlowNodeId 1:17
       at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.initializeStorage(CpsFlowExecution.java:678)
       at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onLoad(CpsFlowExecution.java:715)
       at org.jenkinsci.plugins.workflow.job.WorkflowRun.getExecution(WorkflowRun.java:659)
       at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:525)
       at hudson.model.RunMap.retrieve(RunMap.java:225)
       at hudson.model.RunMap.retrieve(RunMap.java:57)
       at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:499)
       at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:481)
       at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:379)
       at hudson.model.RunMap.getById(RunMap.java:205)
       at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.run(WorkflowRun.java:896)
       at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.get(WorkflowRun.java:907)
       at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$1.computeNext(FlowExecutionList.java:65)
       at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$1.computeNext(FlowExecutionList.java:57)
       at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
       at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
       at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl.onLoaded(FlowExecutionList.java:178)
       at jenkins.model.Jenkins.<init>(Jenkins.java:975)
       at hudson.model.Hudson.<init>(Hudson.java:85)
       at hudson.model.Hudson.<init>(Hudson.java:81)
       at hudson.WebAppMain$3.run(WebAppMain.java:233)
      Finished: FAILURE
      

        Attachments:
        1. 3219.zip (398 kB)
        2. 48064.tar.gz (85 kB)
        3. flowNodeStore.xml (22 kB)
        4. plugins_versions_2.190.1.txt (5 kB)

          [JENKINS-55287] Pipeline: Failure to load flow node: FlowNode was not found in storage for head

          Oleg Nenashev added a comment -

          Restored the priority set by the reporters


          Devin Nusbaum added a comment - - edited

          famod You can email it to me: dnusbaum at cloudbees.com. And no, I don't need to see workspace files. The main thing I want to see is the Jenkins system logs and/or service logs around the time of the last Jenkins restart, to see whether the build in question was able to be saved before the shutdown or not.

          Mark Hollingsworth added a comment -

          I'm not able to provide data due to company rules, but as an additional data point: we experienced this when the job in question lost its connection to the node, not from a Jenkins restart.

          [2020-07-24T13:47:01.622Z] Cannot contact someNodeNameHere: java.lang.InterruptedException
          Creating placeholder flownodes because failed loading originals.
          java.io.IOException: Tried to load head FlowNodes for execution Owner[SomeFolder/SomeJob/SomeBuild:SomeFolder/SomeJob #SomeBuild] but FlowNode was not found in storage for head id:FlowNodeId 1:1714

          Devin Nusbaum added a comment - - edited

          mhollingsworthcs Thanks for the feedback. As far as I can tell, though, there is no way for the "Creating placeholder flownodes because failed loading originals" message to be printed to the build log unless the Pipeline is resuming, which should only happen when Jenkins starts (although bugs in historical builds that have a broken state could also cause them to attempt to resume when they shouldn't). The full stack trace would help clarify: if "jenkins.model.Jenkins.<init>" is part of the stack trace for the exception, then Jenkins is starting up.

          As a general update, my current understanding of this problem based on the data I have received is that in most cases it happens for Pipelines using the PERFORMANCE_OPTIMIZED durability level when Jenkins crashes. The PERFORMANCE_OPTIMIZED durability level makes no guarantees that Pipelines will be resumable if Jenkins crashes, so this behavior is expected in that case. There is supposed to be a more user-friendly error message explaining that the Pipeline cannot be resumed for these reasons rather than just the raw error of what exactly kept the Pipeline from resuming, but that is broken because of JENKINS-53358.

          I have a draft PR up to try to improve the messaging around this case, so that something like the following would be printed in these cases instead:

          Unable to resume because the Pipeline is using the PERFORMANCE_OPTIMIZED durability level but was not saved before Jenkins stopped. Did Jenkins crash? See https://www.jenkins.io/doc/book/pipeline/scaling-pipeline/ for details about Pipeline durability levels.

          If anyone has seen this issue with Pipelines that are not using the PERFORMANCE_OPTIMIZED durability level, or can show Jenkins system and service logs of this issue occurring for a PERFORMANCE_OPTIMIZED Pipeline even after a normal Jenkins restart (with log messages, as described in this comment, showing that the Pipeline was persisted before shutdown), that would be very interesting, and we should create new tickets to track those cases because they would be distinct issues. For some of the stack traces here, it looks like there is a problem where CpsStepContext.isReady is resulting in Pipelines being resumed, which is strange; I am not sure how to reproduce those issues, and they probably need to be investigated separately.
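
          As a practical workaround while the messaging is improved, a Pipeline can opt back into a safer durability level per job. A minimal Declarative sketch (the stage and step contents are illustrative, not from this ticket):

          pipeline {
              agent any
              options {
                  // Trade some write performance for the ability to resume
                  // after an unclean controller shutdown.
                  durabilityHint('MAX_SURVIVABILITY')
              }
              stages {
                  stage('Build') {
                      steps {
                          echo 'FlowNodes for this run are persisted aggressively.'
                      }
                  }
              }
          }

          The default for all Pipelines can also be changed globally via the Pipeline durability settings under Manage Jenkins → Configure System.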

          Mark Hollingsworth added a comment -

          dnusbaum: we definitely see jenkins.model.Jenkins.<init> as part of the trace, but I'm almost 100% certain that Jenkins did not start up (I would have been paged very early in the morning if it had).

          [2020-07-24T13:47:01.622Z] Cannot contact someNode: java.lang.InterruptedException
          Creating placeholder flownodes because failed loading originals.
          java.io.IOException: Tried to load head FlowNodes for execution Owner[someFolder/someJob/buildNumber:someFolder/someJob #buildNumber] but FlowNode was not found in storage for head id:FlowNodeId 1:1714
           at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.initializeStorage(CpsFlowExecution.java:679)
           at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onLoad(CpsFlowExecution.java:716)
           at org.jenkinsci.plugins.workflow.job.WorkflowRun.getExecution(WorkflowRun.java:680)
           at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:539)
           at hudson.model.RunMap.retrieve(RunMap.java:225)
           at hudson.model.RunMap.retrieve(RunMap.java:57)
           at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:501)
           at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:483)
           at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:381)
           at hudson.model.RunMap.getById(RunMap.java:205)
           at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.run(WorkflowRun.java:929)
           at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.get(WorkflowRun.java:940)
           at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$1.computeNext(FlowExecutionList.java:65)
           at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$1.computeNext(FlowExecutionList.java:57)
           at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
           at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
           at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl.onLoaded(FlowExecutionList.java:178)
           at jenkins.model.Jenkins.<init>(Jenkins.java:1017)
           at hudson.model.Hudson.<init>(Hudson.java:85)
           at hudson.model.Hudson.<init>(Hudson.java:81)
           at hudson.WebAppMain$3.run(WebAppMain.java:262)

          What I can tell you is that we have been able to reproduce this issue by terminating agent machines too early.

          In this case our infrastructure killed the agent running Job 1 before the job had finished completely, and we reproduced the issue.

          Job 2 is the following job, which did not get cut off early.

          Job 1
          16:23:36 [WS-CLEANUP] Deleting project workspace...
          16:23:36 [WS-CLEANUP] Deferred wipeout is used...
          16:23:36 [WS-CLEANUP] done
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] }
          ----------------------------
          Job 2
          16:16:42 [WS-CLEANUP] Deleting project workspace...
          16:16:42 [WS-CLEANUP] Deferred wipeout is used...
          16:16:42 [WS-CLEANUP] done
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] }
          [Pipeline] // parallel
          [Pipeline] }
          [Pipeline] // script
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] stage
          [Pipeline] { (Declarative: Post Actions)
          [Pipeline] node
          16:16:42 Running on a node
          [Pipeline] {
          [Pipeline] echo
          16:16:42 Reporting build status: UNSTABLE
          [Pipeline] notifyBitbucket
           notifying stuff
          [Pipeline] step
          16:16:44 Notifying Bitbucket
          16:16:44 Notified Bitbucket 
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] }
          [Pipeline] // timestamps
          [Pipeline] }
          [Pipeline] // withEnv
          [Pipeline] End of Pipeline
          

          Hope that helps!


          Simon Sudler added a comment -

          I found the reason for the FlowNode errors on my system:

          Sep  6 00:29:20 ship kernel: [16183195.286559] Out of memory: Kill process 27501 (java) score 122 or sacrifice child
          Sep  6 00:29:20 ship kernel: [16183195.292176] Killed process 27501 (java) total-vm:21995844kB, anon-rss:5537216kB, file-rss:0kB, shmem-rss:0kB
          Sep  6 00:29:20 ship kernel: [16183196.039063] oom_reaper: reaped process 27501 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
          

          And it is very consistent: I checked the last 20 occurrences, and every time the flow node error occurred, the OOM killer had done its work. Why the Java process on the build node requires around 20 GB of memory is unclear, because the build itself does not have such huge memory requirements (maybe some other issue).

          Since the OOM killer terminates the Java process, there is no proper feedback from the FlowNode... maybe a sanity check on whether the build client process is still alive would do the trick and produce a helpful error message.

          In my case, more memory helped...


          nishant dani added a comment - - edited

          I am able to consistently get this error. I have attached jobs/yeticore/branches/master/builds/2/workflow-fallback/flowNodeStore.xml. My interpretation of this file could be wrong, but it appears it is searching for flow node 49, while all the file contains is flow nodes 51 and 52. That is the file in workflow-fallback; the flow in workflow seems to have a single node with id 2.

          Totally blocked at this time, so willing to investigate further 

          Update: I am running Jenkins inside a Docker container. Under Docker → Preferences → Resources → Memory, I increased the memory from 2 GB to 16 GB (and restarted Docker), and I was able to get a successful run. Will keep monitoring.

          Attachment: flowNodeStore.xml


          Ulli Hafner added a comment - - edited

          On my side, adding more memory to my Docker container runtime worked as well.

          It would be helpful if this exception were caught and a meaningful message (low memory) were shown to users.


          Ramon Leon added a comment -

          Lowering the priority as the behavior is expected and the fix is to improve the log message.


          Liam Baker added a comment -

          Hi, I am also experiencing this issue.

          2022-09-19 12:04:16 |  [GitCheckoutListener] -> No new commits found
          2022-09-19 12:04:16 |  [Pipeline] Start of Pipeline
          2022-09-19 12:04:16 |  [Pipeline] withEnv
          2022-09-19 12:04:16 |  [Pipeline] {
          2022-09-19 12:04:16 |  [Pipeline] stage
          2022-09-19 12:04:16 |  [Pipeline] { (Track (linux_x64))
          2022-09-19 12:04:17 |  Creating placeholder flownodes because failed loading originals.
          2022-09-19 12:04:17 |  [Pipeline] Start of Pipeline
          2022-09-19 12:04:17 |  [Pipeline] End of Pipeline 
          2022-09-19 12:04:17 |  [GitHub Checks] GitHub check (name: [redacted], status: completed) has been published.
          2022-09-19 12:04:18 |  
          2022-09-19 12:04:18 |  GitHub has been notified of this commit’s build result
          2022-09-19 12:04:18 |  
          2022-09-19 12:04:18 |  Finished: FAILURE


          • After printing the message "Finished: FAILURE", the job will actually continue and will run on a node once one with a suitable tag becomes available.
          • The build will be available at its URL if typed manually, but no links to it appear in the Jenkins UI.
          • Subsequent builds of this pipeline will not run until the 'broken' build is deleted using the script console (see the sketch at the end of this comment).
          • The failure has occurred on many different pipelines, both custom-defined and GitHub multibranch.
          • The occurrence rate averages two to four per month.

          I am currently looking for a reason/triggering condition, as this error does not correlate with high load on the Controller (CPU/RAM/DISK), errors on the host system log (journalctl/dmesg), or errors/warnings in the Jenkins System log.

          It is most common for jobs that are "replayed" or manually triggered (using the "Build Now"/"Build with parameters" button), and it may be related to manually triggering multiple jobs within a short period of time (~10 s).
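
          For reference, deleting such a 'broken' build from the script console can be as simple as the following sketch (the job name and build number are placeholders):

          // Script console sketch: remove the broken build record that
          // blocks subsequent runs. 'SomeFolder/SomeJob' and 42 are placeholders.
          def job = Jenkins.get().getItemByFullName('SomeFolder/SomeJob')
          def broken = job.getBuildByNumber(42)
          if (broken != null) {
              broken.delete()  // deletes the run and its build directory on disk
          }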


            Assignee: Unassigned
            Reporter: Rui Hao (haorui658)
            Votes: 45
            Watchers: 63