Jenkins / JENKINS-35905

Add option to Fail the build if node label does not exist or if it cannot be provisioned within a timeout

    Details

    • Type: New Feature
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: plugin-proposals
    • Labels:
      None
    • Environment:
      Jenkins 1.642
      github-branch-source-plugin:1.7
      github-organization-folder-plugin:1.3
      Description

      Issue

      When a node label does not exist, the build hangs in the queue until a node is assigned to that label. In that case I would like my build to fail. It would be great to have such an option, and also the possibility to apply it with GitHub Organization folders.

      How to Reproduce

      node ('thislabeldoesnotexists') {
          echo "This could be improved"
      }
      

      This label does not exist in my instance, so the build hangs in the queue and the console log shows:

      > There are no nodes with the label ‘thislabeldoesnotexists’

            Activity

            allan_burdajewicz Allan BURDAJEWICZ created issue -
            ppitonak Pavol Pitoňák added a comment -

            Be careful: nodes might be provisioned dynamically (e.g. from AWS or OpenStack), and it can take a few minutes for them to appear under the label (if not already started).

            jglick Jesse Glick added a comment -

            Right, in some cases an agent with that label will be attached later. It is also common for the existence of a queue item to spur a Cloud to provision an agent with a matching label.

            jglick Jesse Glick made changes -
            Field Original Value New Value
            Resolution Won't Fix [ 2 ]
            Status Open [ 1 ] Resolved [ 5 ]
            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 172525 ] JNJira + In-Review [ 199274 ]
            abayer Andrew Bayer made changes -
            Component/s pipeline-general [ 21692 ]
            abayer Andrew Bayer made changes -
            Component/s workflow-plugin [ 18820 ]
            oleg_nenashev Oleg Nenashev made changes -
            Summary Fail the build if node label does not exist. Add option to Fail the build if node label does not exist or if it cannot be provisioned within a timeout
            oleg_nenashev Oleg Nenashev made changes -
            Issue Type Improvement [ 4 ] New Feature [ 2 ]
            oleg_nenashev Oleg Nenashev added a comment -

            I think this issue is a valid improvement request. The default behavior of node() cannot be changed, of course, but this may be doable via additional Pipeline closures or additional node closure parameters.

            oleg_nenashev Oleg Nenashev made changes -
            Assignee Jesse Glick [ jglick ]
            Resolution Won't Fix [ 2 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            jglick Jesse Glick made changes -
            Component/s github-branch-source-plugin [ 20858 ]
            Component/s pipeline [ 21692 ]
            jglick Jesse Glick added a comment -

            No such feature exists for any other job type, and in general it is not a good idea. You could probably write a Pipeline library which implements it if you really wanted. Of course simply wrapping everything in timeout would work.
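
A minimal sketch of the "wrap everything in timeout" suggestion, using the standard Pipeline `timeout` step around the `node` block from the issue description. The 10-minute limit is an arbitrary assumption, not a value from this thread.

```groovy
// Sketch only: abort the build if the whole block, including the time spent
// waiting in the queue for an agent, takes longer than 10 minutes.
// The limit is an assumed value; choose one that allows for cloud provisioning.
timeout(time: 10, unit: 'MINUTES') {
    node('thislabeldoesnotexists') {
        echo 'Running on an agent'
    }
}
```

Note that the limit covers the body's run time as well as the queue wait, so it has to be generous enough for the work itself and cannot distinguish "no agent appeared" from "the build was slow".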

            jglick Jesse Glick made changes -
            Resolution Won't Fix [ 2 ]
            Status Reopened [ 4 ] Resolved [ 5 ]
            oleg_nenashev Oleg Nenashev made changes -
            Component/s plugin-proposals [ 15491 ]
            Component/s durable-task-plugin [ 18622 ]
            oleg_nenashev Oleg Nenashev added a comment -

            Since Jesse Glick disagrees, I have moved it to plugin proposals

            oleg_nenashev Oleg Nenashev made changes -
            Resolution Won't Fix [ 2 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            hrmpw Patrick Wolf added a comment -

            From Jesse Glick:
            `ExecutorStepExecution.CancelledItemListener` could check `User.current` and if not null, print a message indicating what user cancelled the item, thus resulting in the build failure. Not sure it would help anything, but would at least be very easy to implement!

            hrmpw Patrick Wolf made changes -
            Link This issue is related to JENKINS-40466 [ JENKINS-40466 ]
            rtyler R. Tyler Croy made changes -
            Link This issue is related to JENKINS-41569 [ JENKINS-41569 ]
            cloudbees CloudBees Inc. made changes -
            Remote Link This issue links to "CloudBees Internal CJP-7183 (Web Link)" [ 19101 ]
            alexander_kazakov Aleksandr Kazakov added a comment -

            For anyone who's struggling with this issue - we use a workaround. We use it for cloud providers but it can be adjusted and used for static slave nodes as well.

            We didn't use it in declarative pipeline, but I think this approach can be applied to it too. 

            We have a script in our shared library vars/node.groovy:

             

            import org.jenkinsci.plugins.workflow.cps.DSL
            import hudson.model.Label
            import jenkins.model.Jenkins
            import antlr.ANTLRException
            
            // Returns an error message if no configured cloud can provision the label,
            // or an empty string otherwise. The check is skipped when no clouds are configured.
            @NonCPS
            String checkLabel(String labelExpression) {
                String errorMessage = ''
                def clouds = Jenkins.instance.clouds
                if (clouds) {
                    try {
                        def label = Label.parseExpression(labelExpression)
                        def labelMatchedCloud = clouds.find { it.canProvision(label) }
                        if (!labelMatchedCloud) {
                            errorMessage = "Cannot find a node with label ${labelExpression}"
                        }
                    } catch (ANTLRException e) {
                        errorMessage = "Invalid node label expression:\n${e.message}"
                    }
                }
            
                return errorMessage
            }
            
            // node { ... } without a label is rejected outright.
            def call(Closure body) {
                error """No label specified.
            Usage:
            node('<<your_label_1>> && <<your_label_2>>') {
                ...
            }"""
            }
            
            // node('label') { ... }: validate the label, then delegate to the built-in node step.
            def call(String label, Closure body) {
                def errorMessage = checkLabel(label)
                if (errorMessage) {
                    error errorMessage
                }
                
                DSL steps = getBinding().getVariable('steps') as DSL
                steps.invokeMethod('node', [label, body] as Object[])
            }

            Jenkinsfile

             

            @Library('myFancyLibrary@version') _
            
            node('label1 && label2') {
                // method from the lib is used ...
            }
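
The comment above says the check can be adjusted for static nodes. One possible adjustment is sketched below. It is not the commenter's code; the function name `checkLabelIncludingStaticAgents` is hypothetical. It relies on `Label.getNodes()` to see whether any configured agent matches the expression, and keeps the cloud check as a fallback. It catches a generic `Exception` on parsing, an assumption made so the sketch does not depend on a particular parser exception type.

```groovy
import hudson.model.Label
import jenkins.model.Jenkins

// Hypothetical variant of checkLabel: returns '' when the label is usable,
// otherwise an error message. Assumes it runs in a trusted shared library.
@NonCPS
String checkLabelIncludingStaticAgents(String labelExpression) {
    Label label
    try {
        label = Label.parseExpression(labelExpression)
    } catch (Exception e) {
        return "Invalid node label expression:\n${e.message}"
    }
    // True if at least one configured agent matches the expression.
    boolean nodeMatches = !label.nodes.isEmpty()
    // True if at least one configured cloud claims it can provision the label.
    boolean cloudMatches = Jenkins.instance.clouds.any { it.canProvision(label) }
    return (nodeMatches || cloudMatches) ? '' : "Cannot find a node with label ${labelExpression}"
}
```

A matching agent that is configured but offline still counts as a match here, so this only catches labels that nothing is configured to serve. An agent that is down would still leave the build waiting in the queue.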

             

             

             

            asreekumar Adity Sreekumar added a comment -

            We currently face this issue on our builds as well, on occasion, as we migrate jobs from one Jenkins server to another, which involves relabeling nodes and adding new ones. What would make sense to me is a timeout that could be applied to the pipeline as a whole and not just to a step.
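
For Declarative Pipeline, a sketch of how a timeout might cover the wait for an agent. As far as the Declarative documentation describes it, a `timeout` in the top-level `options` is applied only after a top-level agent has been allocated, whereas stage-level `options` are applied before the stage's agent is allocated. The sketch therefore uses `agent none` at the top and puts both the agent and the timeout on the stage. The label `special-hardware`, the stage name, and the 15-minute limit are assumptions for illustration.

```groovy
pipeline {
    // No top-level agent, so nothing is allocated outside the stage timeout.
    agent none
    stages {
        stage('Test') {
            // Hypothetical label for an agent that may be missing or offline.
            agent { label 'special-hardware' }
            options {
                // Stage-level option: the clock starts before the agent is allocated.
                timeout(time: 15, unit: 'MINUTES')
            }
            steps {
                echo 'Running on the agent'
            }
        }
    }
}
```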

            amidar Amit Dar made changes -
            Assignee Amit Dar [ amidar ]
            amidar Amit Dar added a comment -

            This would be a highly valuable option for organizations that use one main Jenkins server to handle jobs for various development and integration groups.

            This way, a job waiting for execution would be automatically removed from the build queue after a predefined time allowed for machine provisioning, and the build queue would stay "clean". A notification with the reason should also be visible in the console output of the execution (even though it was not executed at all).

            amidar Amit Dar made changes -
            Assignee Amit Dar [ amidar ]
            marcusschulmann Marcus Schulmann added a comment -

            I wonder why this is still not yet implemented. This clogged our whole build-pipeline over the weekend because one agent was down, which was needed for specific tests.

            amidar Amit Dar added a comment -

            I totally agree with Marcus Schulmann. Sometimes there is a unique agent (with specific hardware, for instance) used for tests. I wonder what we can do to raise this issue's priority and get it handled?

            tamerlaha ipleten added a comment -

            Timeout would be very useful.

            jasonperrone Jason Perrone added a comment -

            This is causing us serious problems as well. I thought the labels would be looked at as tags. If there is no match, no big deal, simply skip it. I mean, think about it. I can have a label defined like "prod || test || development" and if I have so much as one node matching any of those labels I'm fine. If all I had was a node with a label of prod, and no test or development nodes, no problem. But all of a sudden if I don't match on prod, it hangs in the queue? No, just skip it, like you would skip any node that doesn't match the label. This is messing up our whole paradigm because we have one set of jobs that are source controlled so that we don't have all these variants all over the place. But because of this, it doesn't work.

            tomko Tom Kostiainen added a comment -

            Totally agree that this is needed! We have the same situation as Dar mentioned. Sometimes a job gets stuck and then the queue fills up, and eventually we end up with a mess. Recovery then requires manual intervention: purging the queue, etc.


              People

              Assignee:
              Unassigned
              Reporter:
              allan_burdajewicz Allan BURDAJEWICZ
              Votes:
              12
              Watchers:
              18
