[JENKINS-58752] Timeout with activity: true and activity timed out

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Jenkins ver. 2.176.2
      Basic Steps 2.18
      Job running on a Windows Slave connected with JNLP4

      My pipeline has the following options:

        options {
          buildDiscarder(logRotator(numToKeepStr: '3'))
          disableResume()
          timeout(activity: true, time: 10, unit: 'MINUTES')
        }
      

      However, the build sometimes times out even though new output is appended to the log every few seconds. In that case the log says:

      Timeout set to expire after 10 min without activity
      ...
      Cancelling nested steps due to timeout
      Sending interrupt signal to process
      ...
      Timeout has been exceeded

      The timeout always occurs in the Test stage, during a Nightwatch test. The exact position does not seem to be deterministic: the stage has 4 Nightwatch tests, and I have seen timeouts both at the beginning and at the end of the stage.

      Sometimes the pipeline works as it should and the build completes successfully after about 16 minutes.
      I also changed the timeout value (e.g. 2, 3, 5, 15 minutes) and got similar results.
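
For reference, the behaviour expected from `timeout(activity: true, ...)` can be sketched as an inactivity watchdog: every log line resets the deadline, and the timeout should fire only when no line has arrived for the configured period. The sketch below is a hypothetical illustration of that expectation, not the plugin's implementation; the class name `InactivityWatchdog` and its methods are assumptions made for this example.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical model of an "activity" timeout (not the plugin's code):
 * the deadline moves forward every time a log line is seen.
 */
final class InactivityWatchdog {
    private final long timeoutMillis;
    private long lastActivityMillis;

    InactivityWatchdog(long time, TimeUnit unit, long startMillis) {
        this.timeoutMillis = unit.toMillis(time);
        this.lastActivityMillis = startMillis;
    }

    /** Record activity: called whenever the build writes a log line. */
    void onLogLine(long nowMillis) {
        lastActivityMillis = nowMillis;
    }

    /** True only if no log line has been seen for the whole timeout period. */
    boolean isExpired(long nowMillis) {
        return nowMillis - lastActivityMillis >= timeoutMillis;
    }
}
```

Under this model, a 16-minute build that logs every few seconds never reaches a 10-minute gap, so the timeout should not fire; the reported aborts are inconsistent with that expectation.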

       


          Bartosz added a comment (edited)

          Do you still have this problem?
          I have the same issue (activity timeout even if logs are updated), on Jenkins 2.222.4 and Pipeline: Basic Steps 2.20.
           

          [2021-04-29T10:22:51.996Z] [Pipeline] {
          [2021-04-29T10:22:52.052Z] [Pipeline] timeout
          [2021-04-29T10:22:52.052Z] Timeout set to expire after 20 min without activity
          ...
          [2021-04-29T10:53:41.889Z] Cancelling nested steps due to timeout
          [2021-04-29T10:53:41.927Z] Installing '/var/tmp/pkgsrc/work/packages/All/tex-merriweather-2014.tgz' ....Sending interrupt signal to process
          [2021-04-29T10:53:48.094Z] Terminated
          [2021-04-29T10:53:48.105Z] script returned exit code 143
          [2021-04-29T10:53:48.140Z] [Pipeline] }
          [2021-04-29T10:53:48.204Z] [Pipeline] // script
          [2021-04-29T10:53:48.266Z] [Pipeline] }
          ...
          [2021-04-29T11:00:10.801Z] [Checks API] No suitable checks publisher found.
          [2021-04-29T11:00:10.838Z] Timeout has been exceeded
          [2021-04-29T11:00:10.838Z] Finished: ABORTED
          

          I'm wondering whether upgrading Jenkins (or the Basic Steps plugin) to the latest version would resolve the issue.

          How can I debug this (for example, get the Jenkins logs from the run)?

          The activity feature was introduced with ticket:
          https://issues.jenkins.io/browse/JENKINS-26521
          and Pull Request:
          https://github.com/jenkinsci/workflow-basic-steps-plugin/pull/62/files
           


          Michael Korn added a comment -

          I can't say what the current status is. We no longer use Jenkins for most projects.


          Bartosz added a comment (edited)

          In our setup, we run stages in parallel (see attached screenshot) on a mix of Ubuntu systems (which use Docker) and native macOS builds.

          mikorn, what did your configuration look like?


          Mario added a comment (edited)

          I can confirm the issue (it happens randomly), using scripted pipelines on Jenkins 2.235.5 and Pipeline: Basic Steps 2.20.

          09:02:11 Timeout set to expire after 4 hr 0 min without activity
          .....
          13:38:04 Scheduling project: ......
          13:50:17 Starting building: .......
          15:07:34 Cancelling nested steps due to timeout
          15:08:34 Body did not finish within grace period; terminating with extreme prejudice
          15:08:35 Error occurred


          Bartosz added a comment (edited)

          Unfortunately I don't have enough logs, but it is possible that if flush hangs, Tick is never triggered.
          To work around that, you can try moving the Tick creation from line 312:
          https://github.com/jenkinsci/workflow-basic-steps-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/steps/TimeoutStepExecution.java#L312

          new Tick(active, new WeakReference<>(decorated), timeout, channel, id).schedule();
          

          to a new line after line 292:
          https://github.com/jenkinsci/workflow-basic-steps-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/steps/TimeoutStepExecution.java#L292

          AtomicBoolean active = new AtomicBoolean();
          new Tick(active, new WeakReference<>(decorated), timeout, channel, id).schedule();
          OutputStream decorated = new LineTransformationOutputStream() {
          


          Adam Gabryś added a comment (edited)

          We hit the same problem on our servers. I analyzed the source code and I think that when many activity timeouts are in use at the same time, some "killer" objects are lost and not cancelled as they should be. That is why builds are stopped even while new log output is being printed.

          I created a PR with our implementation: jenkinsci/workflow-basic-steps-plugin PR-192. You can read more details there. I also attached a patched plugin version. Feel free to install it and test it on your server.
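
The failure mode described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a "killer" is a scheduled task whose handle is overwritten on each reset without cancelling the previous one; whether the plugin loses killers in exactly this way is not confirmed by this thread. The class `ActivityTimeoutSketch` and its method names are hypothetical and are not taken from the plugin or from PR-192.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a resettable timeout; not the plugin's code. */
final class ActivityTimeoutSketch {
    private final ScheduledExecutorService timer;
    private final long timeoutMillis;
    private final Runnable onTimeout;
    private ScheduledFuture<?> killer; // handle of the currently known killer task

    ActivityTimeoutSketch(ScheduledExecutorService timer, long timeoutMillis, Runnable onTimeout) {
        this.timer = timer;
        this.timeoutMillis = timeoutMillis;
        this.onTimeout = onTimeout;
    }

    /** Buggy reset: the old handle is overwritten, so the old killer stays scheduled. */
    synchronized void onActivityLosingKiller() {
        killer = timer.schedule(onTimeout, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Safe reset: cancel the previous killer before scheduling a new one. */
    synchronized void onActivitySafely() {
        if (killer != null) {
            killer.cancel(false);
        }
        killer = timer.schedule(onTimeout, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Body finished: cancel the only killer this object still knows about. */
    synchronized void stop() {
        if (killer != null) {
            killer.cancel(false);
        }
    }
}
```

In the buggy variant, every reset leaves an orphaned task that still fires after the original delay, regardless of later log activity, which would match the symptom of builds being aborted while output is still arriving.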


          Daniel Steiert added a comment (edited)

          The colleague who originally created the PR has left the department. I am taking over the above-mentioned PR and had to recreate it here: jenkinsci/workflow-basic-steps-plugin PR-231. It has been rebased onto the latest master branch.


          Steven added a comment -

          We're also seeing random timeouts trigger even though only one timeout is in operation at the time and the logs are being updated constantly.


            Assignee: Unassigned
            Reporter: mikorn (Michael Korn)
            Votes: 7
            Watchers: 16