Jenkins / JENKINS-58878

withCredentials hangs

    Details

    • Released As:
      workflow-cps 2.76

      Description

      On Jenkins 2.176.2 and credentials-binding-plugin 1.19, the following pipeline will intermittently hang:

      pipeline {
          agent any
      
          stages {
              stage('bug') {
                  steps {
                      script {
                          for (int i = 0; i < 100; i++) {
                              withEnv(['A=b']) {
                                  withCredentials([]) {
                                      sh "echo hello ${i}"
                                  }
                              }
                          }
                      }
                  }
              }
          }
      }
      

      The build log looks like this:

      ...
      [Pipeline] withEnv
      [Pipeline] {
      [Pipeline] withCredentials
      

      until the build is aborted.

      I've seen two independent things that both seem to work around the problem:

      • flipping so `withCredentials([])` wraps `withEnv([...])` seems to avoid the hang.
      • ensuring there's always at least one item in the `withCredentials` list parameter seems to avoid the hang.

      We run into this in a pipeline where the parameter to withCredentials varies by project and is sometimes an empty list.
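
For a list that is sometimes empty, the second workaround above can be sketched as a small Scripted Pipeline helper that skips the `withCredentials` step when there is nothing to bind. This is only a sketch under assumptions: `withOptionalCredentials` and `bindings` are hypothetical names, not part of any plugin, and the helper has not been verified against the hang itself.

```groovy
// Hypothetical helper (not part of any plugin): only enter withCredentials
// when there is at least one binding, otherwise run the body directly.
def withOptionalCredentials(List bindings, Closure body) {
    if (bindings) {
        withCredentials(bindings) {
            body()
        }
    } else {
        body()
    }
}

node {
    // 'bindings' stands in for the per-project list, which may be empty.
    def bindings = []
    withEnv(['A=b']) {
        withOptionalCredentials(bindings) {
            sh 'echo hello'
        }
    }
}
```

With an empty list the body runs directly inside `withEnv`, so the empty `withCredentials([])` call that appears in the hanging build log is never made.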

            Activity

            mwkaufman Michael Kaufman added a comment -

            We have the same issue in our environment with credentials-binding-plugin 1.20 and Jenkins 2.186. We flipped withCredentials and withEnv but that didn't stop it.

            In the thread dump:
            Thread #166
            at DSL.withCredentials(not currently scheduled, or running blocks)

            jons Jon Sten added a comment - edited

            We were hit by this one as well after upgrading core and a number of plugins. Most notably, we upgraded:

            Jenkins core: 2.176.2 => 2.190.1
            pipeline-build-step:2.7 => 2.9
            workflow-cps:2.71 => 2.74
            workflow-durable-task-step:2.32 => 2.34
            workflow-job:2.33 => 2.35

            credentials-binding-plugin was not updated and we are running 1.18 (for some reason we don't have any explicit inclusion of this plugin; instead it is included as a dependency of other plugins).

            I've been able to reproduce it with the following code: 

            node {
                for (int i = 0; i < 10000; i++) {
                    withCredentials([]) {
                        echo "At ${i}"
                    }
                }
            }
            

            and it doesn't seem that this bug is restricted to the credentials step, e.g. the following also fails for us:

            node {
                for (int i = 0; i < 10000; i++) {
                    wrap([$class: 'BuildUser']) {
                        echo "At ${i}"
                    }
                }
            }

            Rather, it seems like there is some sort of concurrency problem related to GeneralNonBlockingStepExecution, which is used by both steps above. This class is located in workflow-step-api (which wasn't updated in our case: version 2.20 before and after), so my best guess is that the problem lies in workflow-cps, which was updated. I would suggest changing the component to that plugin (and trying to get the attention of one of the plugin maintainers).

            During debugging of this problem I also managed to get a stacktrace during execution of the following:

            node {
                for (int i = 0; i < 10000; i++) {
                    withEnv(['A=b']) {
                        withCredentials([]) {
                            sh "echo hello ${i}"
                        }
                    }
                }
            }
            

            Console output with stacktrace:

            ...
            [Pipeline] {
            [Pipeline] sh
            + echo hello 34
            hello 34
            [Pipeline] }
            [Pipeline] // withCredentials
            [Pipeline] }
            [Pipeline] // withEnv
            [Pipeline] withEnv
            [Pipeline] {
            [Pipeline] withCredentials
            [Pipeline] End of Pipeline
            java.lang.ArrayIndexOutOfBoundsException: 0
            	at org.jenkinsci.plugins.workflow.cps.DSL$ThreadTaskImpl.invokeBody(DSL.java:647)
            	at org.jenkinsci.plugins.workflow.cps.DSL$ThreadTaskImpl.eval(DSL.java:615)
            	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:196)
            	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:370)
            	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:93)
            	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:282)
            	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:270)
            	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:66)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
            	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
            	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
            	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            	at java.lang.Thread.run(Thread.java:748)
            Finished: FAILURE

            This looks closely related, as it occurred during the start of the withCredentials step, and judging from the code in DSL.java it looks like some sort of race condition.

             

            jonathanb1 Jonathan B added a comment -

            Thanks Jon Sten for the additional report and debugging. I've added the workflow-cps-plugin component to the ticket.

            dnusbaum Devin Nusbaum added a comment -

            OK, as suspected by Jon Sten (thank you for the detailed debugging information!), I think the issue affects all steps that use GeneralNonBlockingStepExecution. If such a step starts relatively quickly (as in the case of withCredentials where there are no credentials and thus not much work to do), then the step will sometimes add body invokers from a background thread before the step has switched to asynchronous mode, but after any body invokers that were added synchronously have already been invoked. The body invokers added from the background thread are then left in a state where they will never be invoked, which causes the hang. For anyone seeing the problem, here is an experimental release of 2.76 that includes this PR, which you can use to check whether it fixes the problems you are seeing.
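
The interleaving described above can be illustrated with a small, single-threaded Groovy model. This is only a sketch under assumptions: `StepModel`, `pending`, `invoked`, `addBodyInvoker`, and `drainPending` are hypothetical names chosen for illustration and do not mirror the actual workflow-cps code.

```groovy
// Simplified, single-threaded model of the ordering problem. The class and
// member names are hypothetical and do not mirror the workflow-cps source.
class StepModel {
    boolean async = false          // has the step switched to asynchronous mode?
    List<String> pending = []      // body invokers waiting to be run
    List<String> invoked = []      // body invokers that actually ran

    // Called by the step, possibly from a background thread.
    void addBodyInvoker(String name) {
        if (async) {
            invoked << name        // asynchronous mode: run right away
        } else {
            pending << name        // synchronous mode: leave it for drainPending()
        }
    }

    // Called once, after the step has started.
    void drainPending() {
        invoked.addAll(pending)
        pending.clear()
    }
}

def step = new StepModel()
step.drainPending()              // 1. synchronously added invokers are run (none here)
step.addBodyInvoker('body')      // 2. the background thread adds its invoker too late...
step.async = true                // 3. ...but before the switch to asynchronous mode

assert step.invoked.isEmpty()    // the body never runs
assert step.pending == ['body']  // it stays stranded, which corresponds to the hang
```

If step 2 happens either before step 1 or after step 3, the body ends up in `invoked`, which fits the observation that the hang is intermittent and depends on how quickly the step starts.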

            jons Jon Sten added a comment -

            You're welcome, and thank you Devin Nusbaum for fixing the problem! I started trying to fix the bug myself, but very quickly became confused and disoriented by all the different threads and objects involved, so I gave up. In other words, I appreciate experts like you who can fix these problems!

            I've given the linked experimental release a spin in our staging environment and I'm no longer able to reproduce the issue.

            dnusbaum Devin Nusbaum added a comment -

            A fix for this issue was just released in Pipeline: Groovy Plugin version 2.76.

            reinholdfuereder Reinhold Füreder added a comment -

            Devin Nusbaum By any chance, could this fix also solve the sporadically hanging pipelines I see in a withDockerRegistry/withDockerServer context (with the executor on the Jenkins master *cough*)?

            Example thread dump:

                at DSL.withDockerRegistry(not currently scheduled, or running blocks)
                at org.jenkinsci.plugins.docker.workflow.Docker.withRegistry(Docker.groovy:40)
                at DSL.withEnv(Native Method)
                at org.jenkinsci.plugins.docker.workflow.Docker.withRegistry(Docker.groovy:39)
                at org.jenkinsci.plugins.docker.workflow.Docker.node(Docker.groovy:66)
                at org.jenkinsci.plugins.docker.workflow.Docker.withRegistry(Docker.groovy:38)
                at com.acme.php.phpstorm.OfflineInspection.execute(OfflineInspection.groovy:59)
                at acme.executePhpStormInspections(acme.groovy:256)
                at WorkflowScript.run(WorkflowScript:101)
            ...
            

            Please note, however, that it appeared as if the single or first (?) step inside such withDockerRegistry/withDockerServer contexts may sometimes still actually be executed...

            dnusbaum Devin Nusbaum added a comment -

            Reinhold Füreder Yes, if you were running Docker Pipeline 1.19 or newer, those steps use GeneralNonBlockingStepExecution (see docker-workflow PR 158), so they could have been affected by the bug and should be fixed now.

            reinholdfuereder Reinhold Füreder added a comment -

            Thanks, this is very promising/good news!


              People

              Assignee:
              dnusbaum Devin Nusbaum
              Reporter:
              jonathanb1 Jonathan B
              Votes:
              4
              Watchers:
              7
