It would be desirable to have a standard mechanism for testing Pipeline scripts without running them on a production server. There are two competing suggestions:

      Mock framework

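      Inspired by Job DSL (example: https://github.com/sheehan/job-dsl-gradle-example/blob/e240056da6691bf0a2fdc99e5aab33bc49e42b2f/src/test/groovy/com/dslexample/GrailsCiJobBuilderSpec.groovy#L34-L49).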

      We could set up a GroovyShell in which step functions and global variables were predefined as mocks (in a fashion similar to PowerMock, but easier in Groovy given its dynamic nature), and stub out the expected return value or exception for each, with some standard predefinitions such as for currentBuild.

      Ideally the shell would be CPS-transformed, with the program state serialized and then reloaded between every continuation (though this might involve a lot of code duplication with workflow-cps).

      Should be easy to pass it through the Groovy sandbox (if requested), though the live Whitelist.all from Jenkins would be unavailable, so we would be limited to known static whitelists, the @Whitelisted annotation, and perhaps some custom additions.

      Quick and flexible, but low fidelity to real behavior.
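
As a rough illustration of the mock idea above, here is a minimal sketch in plain Java. It does not use GroovyShell, workflow-cps, or any Jenkins API; the class MockSteps and its methods returning, failing, invoke, and calls are hypothetical names invented for this example. It only shows the core mechanics: each step name is bound to a stubbed return value or exception, and every invocation is recorded so a test can assert on it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical mocked step environment; an assumption for illustration, not a Jenkins API. */
class MockSteps {
    private final Map<String, Function<Object, Object>> stubs = new HashMap<>();
    private final List<String> calls = new ArrayList<>();

    /** Stub a step so that it always returns a fixed value. */
    MockSteps returning(String step, Object value) {
        stubs.put(step, arg -> value);
        return this;
    }

    /** Stub a step so that it throws, simulating a failing step. */
    MockSteps failing(String step, String message) {
        stubs.put(step, arg -> { throw new IllegalStateException(message); });
        return this;
    }

    /** Invoke a step as the script under test would; the call is recorded before the stub runs. */
    Object invoke(String step, Object arg) {
        calls.add(step + "(" + arg + ")");
        Function<Object, Object> stub = stubs.get(step);
        if (stub == null) {
            throw new UnsupportedOperationException("no mock defined for step: " + step);
        }
        return stub.apply(arg);
    }

    /** Read-only view of the recorded invocations, in call order. */
    List<String> calls() {
        return List.copyOf(calls);
    }

    public static void main(String[] args) {
        MockSteps steps = new MockSteps()
                .returning("sh", 0)
                .failing("mail", "SMTP unavailable");
        System.out.println(steps.invoke("sh", "mvn clean test"));
        try {
            steps.invoke("mail", "team@example.com");
        } catch (IllegalStateException e) {
            System.out.println("mail failed: " + e.getMessage());
        }
        System.out.println(steps.calls());
    }
}
```

A real implementation would presumably bind such stubs into the script's variable scope so that an unmodified Jenkinsfile could call them by name; this sketch leaves that part out.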

      JenkinsRule-style

      Use an embedded Jenkins server, as per JenkinsRule in jenkins-test-harness, and actually create a WorkflowJob with the specified definition. Can use for example mock-slave to create nodes.

      Need to have a "dry-run" flag so that attempts to do things like deploy artifacts or send email do not really take action. This could perhaps be a general API in Jenkins core, as it would be useful also for test instances (shadows of production servers), acceptance-test-harness, etc.

      Slower to run (seconds per test case rather than milliseconds), and trickier to set up, but much more realistic coverage. The tests for Pipeline (and Pipeline steps) themselves use this technique.
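
The "dry-run" flag mentioned above is only a proposal in this issue, so the following is a sketch under assumptions rather than an existing Jenkins API. The class DryRun and its methods perform and skipped are hypothetical. The idea shown is that code with side effects (deploying artifacts, sending email) goes through a guard that either runs the action or, in dry-run mode, only records what would have happened.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dry-run guard; an assumption for illustration, not part of Jenkins core. */
class DryRun {
    private final boolean enabled;
    private final List<String> skipped = new ArrayList<>();

    DryRun(boolean enabled) {
        this.enabled = enabled;
    }

    /** Run the side effect normally, or only record its description when dry-run is enabled. */
    void perform(String description, Runnable sideEffect) {
        if (enabled) {
            skipped.add(description);
        } else {
            sideEffect.run();
        }
    }

    /** Descriptions of the actions that were suppressed, in order. */
    List<String> skipped() {
        return List.copyOf(skipped);
    }

    public static void main(String[] args) {
        DryRun dryRun = new DryRun(true);
        dryRun.perform("deploy artifact", () -> System.out.println("deploying for real"));
        dryRun.perform("send email", () -> System.out.println("sending for real"));
        System.out.println("skipped: " + dryRun.skipped());
    }
}
```

Recording the skipped actions lets a test assert that the Pipeline would have deployed or notified, without the test instance actually doing so.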

          [JENKINS-33925] Test framework for Jenkinsfile

          Jesse Glick created issue -

          Andrew Bayer added a comment -

          I'm strongly +1 on the JenkinsRule approach. I think the challenge really entirely lies in the "dry-run" matter - how do we determine what behaviors are the ones we should mock/skip/whatever we do? How do we mock/skip/whatever those behaviors? How is that extensible and easy for plugins to support?

          Patrick Wolf made changes -
          Labels Original: testing New: 3.0 testing
          Patrick Wolf made changes -
          Labels Original: 3.0 testing New: followup testing

          Andrew Bayer added a comment -

          I've been thinking about this a bit more - some thoughts on what I'd like to see in the end result:

          • An easy way to run a test - I think my ideal would be to have a job type that was basically identical to the existing Pipeline job type (with both inline and from-SCM options) that would run the Pipeline script in the test environment.
          • Guaranteeing no side effects - this is the hairy one, obviously. I'm not sure how we'd be able to make sure everything that could have a side effect doesn't - e.g., would we need to interfere and override sh and bat? And def foo = docker.build ... ; foo.push ...? And who knows what else?
          • Going along with that - how do we deal with things like testing if errors happened, or steps that depend on generated output from previous steps that we may have mocked out/overridden?


          Jesse Glick added a comment -

          "a job type that was basically identical to the existing Pipeline job type (with both inline and from-SCM options) that would run the Pipeline script in the test environment"

          Hmm, this would be something very different from the JenkinsRule proposal. I do not think this kind of setup would fly. It would be fine for interactive testing but it would not work well for automated testing.

          "[…] steps that depend on generated output from previous steps that we may have mocked out/overridden"

          Perhaps the mocking facility could be used in the JenkinsRule scenario, as a kind of hybrid approach. So you could declare that, for example, all mail steps, or the third node step, or any sh step taking the script argument mvn clean test, etc., should print the following output, or return the following result object, or fail with the following error message. Unmentioned steps would run “for real” by default, or you could declare that by default these would fail the test, but you could explicitly ask for some to follow the real behavior.

          Some of these facilities would actually be useful in Pipeline plugin tests, too, which would help validate the concept and iron out bugs.
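
The hybrid approach described in this comment could be sketched roughly as follows, in plain Java with no Jenkins dependencies. Everything here is hypothetical: the class StepOverrides, its rule methods all, nth, and withArg, and the Outcome interface are names made up for the example. Rules are checked in declaration order and the first match wins; an invocation that matches no rule falls through to a default policy, which the caller chooses (run the step "for real", or fail the test).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-step override rules; an assumption for illustration, not a Jenkins API. */
class StepOverrides {
    /** What happens when a step invocation is handled: return a value or throw. */
    interface Outcome { Object apply(String step, String arg); }

    private interface Matcher { boolean matches(String step, String arg, int occurrence); }

    private final List<Matcher> matchers = new ArrayList<>();
    private final List<Outcome> outcomes = new ArrayList<>();
    private final Map<String, Integer> seen = new HashMap<>();
    private final Outcome fallback;

    /** The fallback handles every invocation that no rule mentions. */
    StepOverrides(Outcome fallback) {
        this.fallback = fallback;
    }

    /** Override every invocation of the named step. */
    StepOverrides all(String name, Outcome outcome) {
        return add((s, a, n) -> s.equals(name), outcome);
    }

    /** Override only the nth (1-based) invocation of the named step. */
    StepOverrides nth(String name, int index, Outcome outcome) {
        return add((s, a, n) -> s.equals(name) && n == index, outcome);
    }

    /** Override invocations of the named step that take exactly this argument. */
    StepOverrides withArg(String name, String arg, Outcome outcome) {
        return add((s, a, n) -> s.equals(name) && arg.equals(a), outcome);
    }

    private StepOverrides add(Matcher matcher, Outcome outcome) {
        matchers.add(matcher);
        outcomes.add(outcome);
        return this;
    }

    /** Dispatch one invocation: first matching rule wins, otherwise the fallback applies. */
    Object invoke(String step, String arg) {
        int occurrence = seen.merge(step, 1, Integer::sum);
        for (int i = 0; i < matchers.size(); i++) {
            if (matchers.get(i).matches(step, arg, occurrence)) {
                return outcomes.get(i).apply(step, arg);
            }
        }
        return fallback.apply(step, arg);
    }

    static Outcome returning(Object value) {
        return (s, a) -> value;
    }

    static Outcome failing(String message) {
        return (s, a) -> { throw new IllegalStateException(message); };
    }

    public static void main(String[] args) {
        // The fallback lambda stands in for running the real step.
        StepOverrides overrides = new StepOverrides((s, a) -> "real:" + s)
                .all("mail", returning("suppressed"))
                .nth("node", 3, returning("mock-agent"))
                .withArg("sh", "mvn clean test", returning("BUILD SUCCESS"));
        System.out.println(overrides.invoke("sh", "mvn clean test"));
        System.out.println(overrides.invoke("sh", "echo hello"));
    }
}
```

Passing failing("unexpected step") as the fallback instead would give the stricter policy from the comment, where any unmentioned step fails the test unless explicitly allowed.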

          Jesse Glick made changes -
          Epic Link New: JENKINS-35396 [ 171189 ]

          Shijun Kong added a comment -

          I believe a lightweight mock pipeline engine is better than the embedded Jenkins approach, because of the speed of feedback.

          I like externalizing the build process in a Jenkinsfile, which I can put into VCS. However, the burden comes when trying out a new "step" and making sure both its syntax and its semantics are correct. The entire feedback loop today is too long: commit to VCS, wait for Jenkins to pull the updated Jenkinsfile, run the build through multiple stages, and only then hit the new step and fail.

          A

          R. Tyler Croy made changes -
          Workflow Original: JNJira [ 169936 ] New: JNJira + In-Review [ 183698 ]

            Assignee: Unassigned
            Reporter: Jesse Glick (jglick)
            Votes: 120
            Watchers: 136