• Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • Component: pipeline

      The issue occurs if you have a "Multibranch Pipeline" job that takes some time, such as:

      pipeline {
          agent any

          stages {
              stage('only') {
                  steps {
                      checkout scm
                      sh 'sleep 300'
                  }
              }
          }
      }

      This job gets automatically detected by the multibranch pipeline and executed. It can also be kicked off manually by going into the job and clicking the run button beside a branch (e.g. 'master'). If that button is pushed twice in succession (or two branches are committed at the same time), two instances of the job will run. If there is a single build executor that these jobs can run on (I have only seen this when the executor is separate from the "master" node), the first instance will start running. The second will do some pipeline work to identify what needs to be run, but will then wait in the queue for the first job to complete. Once the first job has finished, the second runs to completion.

      For the above job, the first build will show as taking the expected five minutes. However, the second build will show as taking ten minutes: it includes the time it spent waiting for an executor. With many jobs, the recorded time includes all the time each job had to wait in the queue. Looking at the historical builds, it will appear that a build took ten minutes even though it was five minutes of waiting and five of executing. This also skews the projected build time for the next run.

      This also shows "10 minutes building on an executor" if the metrics plugin is installed.

      The solution is to not include the time spent waiting for an executor in the recorded build time. I believe matrix jobs had a similar issue that was fixed in Issue #8112.
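
      For anyone who only needs the two numbers split apart for reporting, the following is a minimal sketch of one possible way to read the queue time that the metrics plugin records. It is not from this issue: the class and accessor names are assumptions from memory and may differ between plugin versions.

      // Sketch, assuming the Metrics plugin is installed and attaches
      // jenkins.metrics.impl.TimeInQueueAction to each run; accessor names are assumptions.
      def run = currentBuild.rawBuild            // needs script approval in a sandboxed pipeline
      def queueInfo = run.getAction(jenkins.metrics.impl.TimeInQueueAction)
      if (queueInfo != null) {
          echo "time in queue: ${queueInfo.queuingDurationMillis} ms"
      }
      echo "reported duration: ${currentBuild.duration} ms (per this issue, includes queue time)"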

          [JENKINS-46569] Job execution time includes waiting time

          Marty S added a comment -

          The priority of this should be higher than "Minor", since the additional waiting time will count towards possible timeouts.

          So if I define a timeout of one hour and the job waits for 30 minutes, it will be cancelled after 30 minutes of "real" execution time.

          Josh Wand added a comment -

          I partially get around this by putting the timeout block inside the node block (using procedural pipeline, anyways):

          stage('stage 1') {
            node {
              timeout(30) {
                // do stuff
              }
            }
          }

          But the build times are still wrong, even for a single stage: the total time reported still includes the time spent waiting for an executor.

          Aaron D. Marasco added a comment -

          joshwand the problem with that workaround is that each stage/node needs its own timeout. :-/
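
          One way to avoid repeating the timeout in every stage, sketched below for a scripted pipeline (the helper name, limits, and shell commands are illustrative, not from this issue): wrap the node-plus-timeout pattern in a small helper so the clock only starts once an executor has been acquired.

          // Hypothetical helper: acquire an executor first, then start the timeout,
          // so queue time does not count against the stage's time budget.
          def onNodeWithTimeout(int minutes, Closure body) {
              node {
                  timeout(time: minutes, unit: 'MINUTES') {
                      body()
                  }
              }
          }

          stage('stage 1') {
              onNodeWithTimeout(30) {
                  sh 'make build'   // placeholder for the real work
              }
          }
          stage('stage 2') {
              onNodeWithTimeout(10) {
                  sh 'make test'    // placeholder
              }
          }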

          David Resnick added a comment -

          I agree, this is definitely not minor. Big problem for us, a shame that it is not higher priority.

          Random Dev added a comment -

          Is there still no workaround for this?
          I agree this should be a higher priority.
          When using multibranch pipelines it risks rendering the timeout option useless.

          Itai added a comment -

          Also encountered this problem. I think there should be a way to reset the timeout, or split it into two distinct timeouts: one for waiting for an agent, and another for once the job started running.

          Roy added a comment - - edited

          It is ridiculous that the total execution time includes the waiting time in the queue. This also makes it pretty useless to define a timeout on a Declarative Pipeline, because there is no workaround for it... I adjusted the priority. Everyone keeps saying it should be adjusted, but no one had. Hopefully the developers agree with the priority change and will have time to fix this bug.

          domingo added a comment -

          Hi! How is this going? I am extracting data from some pipelines and need to separate the execution time from the queue/waiting time for an executor for each stage. With both times mixed together I cannot do the analytics properly. Is there a part of the codebase that measures exactly this? Is it done in the Jenkins core repo or in the metrics plugin?

          If there is any update, or hints about how the problem could be solved, I would be glad to hear them!

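          For splitting the two durations from within the pipeline itself, here is a minimal sketch of one possibility (scripted pipeline; the shell command is a placeholder): take a timestamp just before requesting a node and another once inside it, so queue time and executor time can be reported separately.

          // Sketch: separate queue time from executor time with plain timestamps.
          def requested = System.currentTimeMillis()    // about to ask for an executor
          node {
              def started = System.currentTimeMillis()  // executor acquired, work begins
              echo "queued for ${started - requested} ms"
              try {
                  sh 'make build'                       // placeholder for the real work
              } finally {
                  echo "executed for ${System.currentTimeMillis() - started} ms"
              }
          }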

          Roy added a comment -

          I found a workaround for the timeout problem. We now define the timeout within a stage of the job itself, rather than at the top level. That way the timeout 'timer' starts when the build actually begins executing, instead of when the metadata is parsed and the build can end up in the queue. It is not very pretty, but it works, and it behaves much better. When the timeout is defined in the top-level options of the pipeline and is hit, the build is cancelled without reaching the finally or post sections. We do some important cleanup of the workspace and of Docker containers there; those were not killed and cleaned up when a timeout was reached with the timeout defined in the job's 'options'.

          def timeout_mins = 1

          pipeline {
              agent any
              options {
                  buildDiscarder(logRotator(numToKeepStr: '10'))
              }
              stages {
                  stage('set timeout') {
                      options { timeout(time: timeout_mins, unit: 'MINUTES') }
                      stages {
                          stage('Hello1') {
                              steps {
                                  echo 'start sleep 1'
                                  sleep(30)
                                  echo 'end sleep1'
                              }
                          }
                          stage('Hello2') {
                              steps {
                                  echo 'start sleep2'
                                  sleep(20)
                                  echo 'end sleep2'
                              }
                          }
                      }
                  }
              }
              post {
                  always {
                      echo 'Post is reached'
                  }
              }
          }

          Margaret added a comment -

          Just getting started with moving some jobs to use Pipeline and honestly cannot believe this behaviour. How has this gone unfixed for 7 years?

            Assignee: Unassigned
            Reporter: teeks99 (Thomas Kent)
            Votes: 60
            Watchers: 61