Type: New Feature
Resolution: Unresolved
Priority: Major
It would be desirable to have a standard mechanism for testing Pipeline scripts without running them on a production server. There are two competing suggestions:
Mock framework
Inspired by Job DSL (example).
We could set up a GroovyShell in which step functions and global variables were predefined as mocks (in a fashion similar to Powermock, but easier in Groovy given its dynamic nature), and stub out the expected return value / exception for each, with some standard predefinitions such as for currentBuild.
Ideally the shell would be CPS-transformed, with the program state serialized and then reloaded between every continuation (though this might involve a lot of code duplication with workflow-cps).
Should be easy to pass it through the Groovy sandbox (if requested), though the live Whitelist.all from Jenkins would be unavailable, so we would be limited to known static whitelists, the @Whitelisted annotation, and perhaps some custom additions.
Quick and flexible, but low fidelity to real behavior.
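A minimal sketch of the mock-framework idea, leaving out the CPS transform and sandbox: step functions and global variables are predefined as closures in a Binding, and the script is evaluated in a plain GroovyShell. All names here are illustrative, not an existing API.

```groovy
import groovy.lang.Binding
import groovy.lang.GroovyShell

// Hypothetical mock harness: each step name is bound to a closure that
// records the invocation and returns a stubbed result.
def calls = []
def binding = new Binding()
binding.setVariable('sh', { String script ->
    calls << [step: 'sh', script: script]
    return 0  // stubbed exit code
})
binding.setVariable('echo', { String msg ->
    calls << [step: 'echo', msg: msg]
})
// standard predefinition, as suggested above for currentBuild
binding.setVariable('currentBuild', [result: 'SUCCESS'])

new GroovyShell(binding).evaluate('''
    echo 'building'
    sh 'mvn -B verify'
''')

assert calls*.step == ['echo', 'sh']
```

Because closures in the binding are callable like methods, the Pipeline script needs no changes to run against the mocks.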
JenkinsRule-style
Use an embedded Jenkins server, as per JenkinsRule in jenkins-test-harness, and actually create a WorkflowJob with the specified definition. Can use for example mock-slave to create nodes.
Need to have a "dry-run" flag so that attempts to do things like deploy artifacts or send email do not really take action. This could perhaps be a general API in Jenkins core, as it would be useful also for test instances (shadows of production servers), acceptance-test-harness, etc.
Slower to run (seconds per test case rather than milliseconds), and trickier to set up, but much more realistic coverage. The tests for Pipeline (and Pipeline steps) themselves use this technique.
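For reference, a JenkinsRule-style test looks roughly like the Pipeline plugin's own tests. The class and method names below are the real jenkins-test-harness and workflow-job/workflow-cps APIs, but treat the snippet as an untested sketch:

```groovy
import org.junit.Rule
import org.junit.Test
import org.jvnet.hudson.test.JenkinsRule
import org.jenkinsci.plugins.workflow.job.WorkflowJob
import org.jenkinsci.plugins.workflow.cps.CpsFlowDefinition

class MyPipelineTest {
    @Rule public JenkinsRule j = new JenkinsRule()

    @Test void smokeTest() {
        WorkflowJob p = j.jenkins.createProject(WorkflowJob, 'p')
        // second argument enables the Groovy sandbox
        p.setDefinition(new CpsFlowDefinition("node { echo 'hello' }", true))
        def b = j.assertBuildStatusSuccess(p.scheduleBuild2(0))
        j.assertLogContains('hello', b)
    }
}
```

Each test case boots an embedded Jenkins, which is where the seconds-per-test cost comes from.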
[JENKINS-33925] Test framework for Jenkinsfile
I've been thinking about this a bit more - some thoughts on what I'd like to see in the end result:
- An easy way to run a test - I think my ideal would be to have a job type that was basically identical to the existing Pipeline job type (with both inline and from-SCM options) that would run the Pipeline script in the test environment.
- Guaranteeing no side effects - this is the hairy one, obviously. I'm not sure how we'd be able to make sure everything that could have a side effect doesn't - i.e., would we need to interfere and override sh and bat? And def foo = docker.build ... ; foo.push ...? And who knows what else?
- Going along with that - how do we deal with things like testing if errors happened, or steps that depend on generated output from previous steps that we may have mocked out/overridden?
a job type that was basically identical to the existing Pipeline job type (with both inline and from-SCM options) that would run the Pipeline script in the test environment
Hmm, this would be something very different from the JenkinsRule proposal. I do not think this kind of setup would fly. It would be fine for interactive testing but it would not work well for automated testing.
[…] steps that depend on generated output from previous steps that we may have mocked out/overridden
Perhaps the mocking facility could be used in the JenkinsRule scenario, as a kind of hybrid approach. So you could declare that, for example, all mail steps, or the third node step, or any sh step taking the script argument mvn clean test, etc., should print the following output, or return the following result object, or fail with the following error message. Unmentioned steps would run “for real” by default, or you could declare that by default these would fail the test, but you could explicitly ask for some to follow the real behavior.
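No such declaration API exists today; as a purely hypothetical sketch of what the hybrid approach described above might look like:

```groovy
// Hypothetical API, not an existing library: declare which steps are
// stubbed, and let everything else run for real (or fail by default).
pipelineTest {
    stub(step: 'mail') { /* swallow all mail steps */ }
    stub(step: 'sh', args: [script: 'mvn clean test']) {
        output 'BUILD SUCCESS'
        returnValue 0
    }
    stub(step: 'node', occurrence: 3) { fail 'no executors available' }
    defaultBehavior 'real'   // unmentioned steps run for real
}
```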
Some of these facilities would actually be useful in Pipeline plugin tests, too, which would help validate the concept and iron out bugs.
I believe a lightweight mock pipeline engine is better than the embedded Jenkins approach, for the speed of feedback.
I like externalizing the build process into a Jenkinsfile, which I can put into VCS. However, the burden comes when trying out a new step and making sure both the syntax and the semantics are correct. The entire feedback loop today is too long: commit to VCS, wait for Jenkins to pull the updated Jenkinsfile, run the build through multiple stages, and finally hit the new step, which fails.
Has anyone made any movement on this front? Whether related to this ticket or elsewhere?
JenkinsPipelineUnit, developed recently, looks promising. I have not personally used or reviewed it.
I have used JenkinsPipelineUnit to unit test declarative pipelines and shared code (DSL-style implementations in 'vars') with Spock-based unit tests in a Gradle project for some work I am doing. I needed to extend one of the classes to register many of the declarative pipeline methods, but I have had good success with it. To write unit tests that assert that, say, certain build steps are called, you need to be well versed in Groovy and in how to mock and assert closure calls with Spock.
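A rough sketch of combining Spock with JenkinsPipelineUnit (BasePipelineTest, registerAllowedMethod, loadScript, and callStack are real JenkinsPipelineUnit APIs; the exact wiring, script paths, and assertions here are illustrative and untested). Since Spock requires extending Specification, the harness is composed rather than inherited:

```groovy
import com.lesfurets.jenkins.unit.BasePipelineTest
import spock.lang.Specification

class JenkinsfileSpec extends Specification {
    // Compose rather than extend, since Spock tests must extend Specification
    BasePipelineTest pipelineTest = new BasePipelineTest()

    def setup() {
        pipelineTest.setUp()
        // Register the steps the script uses; unregistered steps fail the test
        pipelineTest.helper.registerAllowedMethod('sh', [String]) { String s -> }
        pipelineTest.helper.registerAllowedMethod('echo', [String]) { String s -> }
    }

    def 'Jenkinsfile invokes the Maven build'() {
        given:
        def script = pipelineTest.loadScript('Jenkinsfile')  // path is illustrative

        when:
        script.run()

        then:
        pipelineTest.helper.callStack
                .findAll { it.methodName == 'sh' }
                .any { pipelineTest.callArgsToString(it).contains('mvn') }
    }
}
```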
billdennis, would it be possible to share your setup on github? Maybe we can derive a minimalistic example project of it.
icereed What I have is in a private repo, but I can try to create a simple minimal Gradle project in my GitHub account if I get time this week.
This would be fantastic!
I think this could be a very good starting point for everybody who wants to ensure quality in a pipeline.
icereed OK - here is what I have on my GitHub: https://github.com/macg33zr/pipelineUnit. It may be a bit raw but should give a good leg-up. The Gradle project unit-tests its own Jenkinsfile with Spock and Groovy.
--Bill
billdennis A big +1 and thank you. I'm still figuring out the quirks, but this is a massive improvement over the cobbled-together tests I had before.
So folks, what is the status? What is the best-practice solution?
With a quick search I found at least:
1) https://github.com/lesfurets/JenkinsPipelineUnit
2) From this thread (billdennis, icereed): https://github.com/macg33zr/pipelineUnit
What should I try, as an end user?
(This is a really important issue - strange that it is so unattended...)
alexz : billdennis's project also uses JenkinsPipelineUnit. It gives a nice example of testing a declarative pipeline using Gradle and Spock.
As developers of JenkinsPipelineUnit, we are trying to give a minimum of support for users. Note that it is not feature complete (e.g. Declarative Pipeline support, scripts using custom plugin DSLs such as docker), but it covers a great deal of testing scenarios.
Keep in mind that we took a pragmatic approach to testing scripts, and it adheres to the first option (Mock framework) described in this ticket. As said, it is quick and flexible but has low fidelity to real behavior.
Just an update on my experience with https://github.com/lesfurets/JenkinsPipelineUnit. I cannot emphasise enough how useful this framework has been and the time it has saved (many thanks to ozangunalp for the work on this). I have recently refactored some Jenkins automation pipelines with the development work being done offline using a testing framework built around Spock, Gradle and JenkinsPipelineUnit working in IntelliJ IDEA. I'd say 80-90% of the development / test / debug work was done offline from Jenkins.
At the point I run the pipelines in Jenkins these are the sort of issues I am coming up against:
- Syntax issues with the pipelines that are not picked up by the framework - things like non-DSL code not in script sections but still valid Groovy. I'd really like to be able to do syntax validation before commit in my development IDE. I am looking to fix this by building some Gradle tasks that use the Jenkins CLI / pipeline linter to validate pipelines against a remote Jenkins server (or one spun up in Docker). I am also looking to run validation as part of the CI process for the pipeline tests (there is a validate-pipeline step for this). That will solve it for the pipelines but not necessarily for shared library code syntax checking (I don't think the linter supports shared libs?).
- Script approvals in Jenkins. It is quite easy to use benign-looking Java/Groovy constructs that fail on script approvals - date parsing, for example. We don't apply shared libraries globally, so this remains an issue for us (there may be other pipelines running on the server that we don't want to expose to our libs). Some way to apply approvals more easily than having to get past every exception individually would be useful. Writing code that avoids the issue altogether is best!
- Groovy and some Java constructs that are not supported in Jenkins Pipeline - you have to remember that Pipeline is not pure Groovy. For this and the previous issue I am thinking about writing custom plugins that provide our own pipeline DSL / support code libraries.
- Things like the order of execution of Pipeline post handlers. The tests execute post handlers in a defined order - say 'always' then 'failure' - so you cannot rely on something happening in one handler that another depends on. If things slip through, tests may pass but Jenkins breaks (my test framework executes the post sections in the order they appear in the Jenkinsfile, I think).
- Over-complex error handling in the pipelines and stages. I am finding it best to simply use an error('something failed') step to fail a stage if required, put error handling in stage 'post { failure {} }' sections rather than complex try-catch, and break things that can fail into separate, shorter stages. It makes the failure scenarios easier to test. Using Declarative Pipeline is good because of the post handlers, and you can always drop into script {} sections to do complex stuff.
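The pattern described in that last bullet looks roughly like this in a Declarative Pipeline (standard syntax; the stage name, file check, and messages are illustrative):

```groovy
pipeline {
    agent any
    stages {
        stage('Verify') {
            steps {
                script {
                    // Fail the stage explicitly rather than via nested try/catch
                    if (!fileExists('build.gradle')) {
                        error 'something failed: build.gradle is missing'
                    }
                }
            }
            post {
                failure {
                    // Error handling lives in the post section, where it is
                    // easy to exercise from a failure-scenario test
                    echo 'Verify stage failed'
                }
            }
        }
    }
}
```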
The linter supports only Declarative Pipeline, not general scripts, much less libraries.
validate pipelines to a remote Jenkins server (or spin one up in a Docker)
Well this is most easily done with the JenkinsRule-style approach mentioned in the original issue description, which is really complementary to unit testing.
I'd guess it is going to be resolved by hosting PipelineUnit as a part of the Jenkins project: https://groups.google.com/forum/#!topic/jenkinsci-dev/VTHyvXqQ0s8
Well that is the first option, with all its advantages and drawbacks. I suppose there is not much practical purpose in keeping this open, anyway.
Jenkinsfile Runner was recently published, which attacks the problem from the opposite end of the spectrum, literally using JenkinsRule internally.
The latest version of Jenkinsfile Runner does not use JenkinsRule anymore, but yes, it is one of the possible approaches to testing nowadays.
The Jenkins Pipeline docs only refer to JenkinsPipelineUnit.
At Jenkins World 2018 there was a talk about jenkins-spock.
(see https://www.youtube.com/watch?v=4PZ-UFBexIE )
I would like to find out which library is the best choice, as I am just now starting to unit-test my Jenkins Pipelines.
jenkinsfile-runner-test-framework can be used for this purpose, again if you are interested in something closer to an integration test than a unit test.
Thanks! I think unit tests are the best way to test the behavior of my custom steps. I just want to check that all mocked steps are called as expected.
For that scenario 'jenkins-spock' seems to be a notable alternative to JenkinsPipelineUnit.
If you agree, it might be a good idea to mention it in the docs. Might be interesting, especially for Spock users.
it might be a good idea to mention it in the docs
If you have field experience doing this successfully, you probably know as much as anyone about the topic, so contributions are welcomed.
sradi81 I suggest bringing up the framework-comparison topics in https://jenkins.io/sigs/pipeline-authoring/ . It is the venue for the ongoing discussion about Pipeline development tools. CC abayer bitwiseman
Guys, why are we not unit-testing pipelines directly using JenkinsRule?
I don't think it's worth supporting a separate engine to run the pipelines, because it will always deviate from the actual engine (for example the JenkinsPipelineEngine CPS issue, and the related CPS issue with find/findAll and other closure functions that was fixed in workflow-cps-plugin years ago).
I'm looking into a way to test with the same interface as JenkinsPipelineEngine but using JenkinsRule and workflow-cps-plugin, and it seems possible, though it will probably require some hooks from CpsScript.invokeMethod (at least I did not find a way without copying it into the Groovy sources). When it works, I will share the changes, and hopefully they will work well.
Tip: if a script defines a method named, say, node, it will take precedence over a Step of the same name.
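That is, roughly (untested sketch; the fake `node` body and messages are illustrative):

```groovy
// In a Pipeline script, a locally defined method shadows the step of the
// same name, which can serve as a crude mocking mechanism.
def node(String label = null, Closure body) {
    echo "pretending to allocate node ${label ?: '(any)'}"
    body()
}

node('linux') {
    echo 'runs without allocating a real executor'
}
```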
jglick I think the good thing about JenkinsPipelineUnit is that the user specifically chooses which steps are allowed. Unfortunately, simply redefining a step still leaves a way to execute some undeclared step. I think that's a potential issue, so without this strict rule proper unit tests will be impossible.
Hi jglick,
I just completed my research and implementation of a JenkinsRule + Groovy-CPS interceptor Jenkinsfile unit-testing framework (link) - right now it's integrated into MPL and it's working, but it does not look as good as I had hoped:
- It uses Java classpath overrides to set up a couple of files that use my invokers: link - not sure that's a good idea... Maybe there is a way to provide some hooks from the Jenkins side, wdyt?
- JenkinsRule produces too many logs - is there a way to disable the logging? My attempts were unsuccessful: link
If you could provide some advice - that will be really great!
Thank you
sparshev Interesting, but I doubt the maintainers of groovy-sandbox / groovy-cps can take on any testability enhancements; keeping the production code running is more than enough work.
jglick Ok, got it. Will work on preparing the changes for groovy-cps unit testing in this intercepting way, if you think it's a good idea.
Ok, just added an issue and a PR with a sample realization to groovy-cps. Hopefully we will find some way to simplify the execution of JenkinsRule-based unit tests for shared libraries.
Hi jglick, I bumped into the hardcoding in the CPS Groovy shell here: https://github.com/cloudbees/groovy-cps/pull/107#issuecomment-601904207 - so it's impossible to use GroovyInterceptor dynamically. Maybe there is another way?
FWIW, it seems there are at least two distinct problems here as far as testing is concerned. One is mocking of the pipeline steps; there are at least two frameworks attempting that already (JenkinsPipelineUnit and jenkins-spock). But more importantly there is a need to test code - both Jenkinsfiles and, more importantly, Pipeline library code - under CPS, as legitimate Groovy code may fail or behave differently under CPS. While there are limited attempts at this in the frameworks mentioned above, neither can properly execute CPS code; as a result, code that tested fine under unit testing may fail spectacularly in real-world use.
With that in mind, we created a small utility class that works around the greatest limitation of CPS: the inability to invoke CPS code from a non-CPS (i.e. unit test) context. We paired this with test-time compilation of code with the CPS transform enabled, as well as the ability to compile a string as a CPS-transformed script. This small utility class solves the second issue of reliably unit-testing pipeline code under CPS.
I would like to share this utility with the community, but given its tiny size (and the fact that it was heavily inspired by groovy-cps's own unit tests), it seems silly to package it separately - does it make sense to submit it to become part of the groovy-cps package? There was a statement earlier on that the maintainers of groovy-cps may not wish to add testing support to the package, so I am hesitant to submit a PR... On the other hand, I would love for someone who actually understands CPS code to review/fix what we cobbled together.
Hi mlasevich, that will be great to see the implementation, because my one ( https://github.com/cloudbees/groovy-cps/pull/107 ) is not great... So will be glad to help with testing.
sparshev For the moment I just pushed a (slightly) cleaned up version into github here:
It boils down to two key static methods (heavily inspired by groovy-cps's own unit tests):
- CPSUtils.invokeCpsMethod(Object object, String methodName, Object...args) - this invokes a CPS Method on an object by name
- CPSUtils.asCPSScript(String script) - which takes a script as a string and runs it (with optional bindings)
Both are primarily intended for use in unit tests. It would be nice to have this be part of a standard library to avoid the constant cut-and-paste (and it is a bit tiny to be its own library).
I also had a wrapper that could wrap any object and run invokeCpsMethod transparently, but it proved to be a bit more complicated and unstable, so I removed it for the time being.
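Based on the two methods listed above, usage might look like the following. CPSUtils is mlasevich's own utility, not a published library, and the Deployer class here is purely illustrative:

```groovy
// Hypothetical usage of the CPSUtils methods described above.
// Assume Deployer is compiled with the CPS transform enabled.
class Deployer {
    def buildTag(String branch, int buildNumber) {
        return "${branch}-${buildNumber}"
    }
}

// Invoke a CPS-transformed method from a plain (non-CPS) unit test
def tag = CPSUtils.invokeCpsMethod(new Deployer(), 'buildTag', 'main', 42)
assert tag == 'main-42'

// Compile and run a string as a CPS-transformed script
def result = CPSUtils.asCPSScript('return 2 + 2')
assert result == 4
```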
mlasevich, thank you for the code. I checked it quickly - what is the difference between your CPSUtils and JenkinsPipelineUnit's https://github.com/jenkinsci/JenkinsPipelineUnit/tree/master/src/main/groovy/com/lesfurets/jenkins/unit/cps ? As far as I know it already uses CPS to run the tests... But, for example, when it comes to Groovy specifics (like the one I found here: https://github.com/griddynamics/mpl/pull/49#issuecomment-551287490 ), CPS and Jenkins Pipeline start to behave differently.
I think it's quite critical to actually use Jenkins and its plugins (security, steps...) for testing; that's why, for example, I override the Jenkins FunctionCallEnv class and use the Jenkins test harness to intercept calls for mocking.
So could you please describe how your script is closer to Jenkins than JenkinsPipelineUnit's? Maybe I got it wrong.
sparshev - My code is far simpler than what JenkinsPipelineUnit does. It simply allows you to execute CPS code - that is all; the rest is up to the testing framework.
JenkinsPipelineUnit (and jenkins-spock) tries to do a lot of other things, like mocking the steps and whatnot. Those are great things, but I had a hard time getting either to play nice with CPS-transformed code. At best they allowed loading scripts directly from Groovy files, but that seems to bypass the classloaders and makes it very difficult to actually unit test anything - at best you can execute the entire `vars` code as a script, but not isolate and call specific methods in specific classes. We have a great deal of actual classes and proper code in our libraries, and I found no easy way to create an instance of a class and test it in either framework. That does not mean there isn't a way; I just have not found it.
My approach is far more straightforward, even simplistic. I enable the CPS transform at compile time for all code, so all my classes are already transformed, and then just run my unit tests like I would with any other code. If I need to call code that is CPS-transformed, I have to call it via CPSUtils, but that is it. I am not running any of the actual steps or any Jenkins code beyond the groovy-cps transforms - I can stub out the calls and assume that the actual steps are unit tested in the plugin code that provides them. It is not end-to-end or integration testing; it is pure unit testing at that point.
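Enabling the transform at test-compile time can be sketched roughly as follows (class names are per the groovy-cps sources; the resume step is deliberately elided, since the exact continuation API is what a helper like the CPSUtils above encapsulates):

```groovy
import com.cloudbees.groovy.cps.CpsTransformer
import com.cloudbees.groovy.cps.impl.CpsCallableInvocation
import org.codehaus.groovy.control.CompilerConfiguration

// Compile test code with the CPS transform applied, as groovy-cps's
// own tests do.
def cc = new CompilerConfiguration()
cc.addCompilationCustomizers(new CpsTransformer())
def shell = new GroovyShell(cc)
def script = shell.parse('def f() { 6 * 7 }; f()')

// Calling CPS-transformed code from a plain (non-CPS) caller does not
// return a value directly: it throws CpsCallableInvocation, which the
// test helper must catch and resume via a Continuable to drive the
// computation to completion.
try {
    script.run()
} catch (CpsCallableInvocation inv) {
    // inv carries the continuation; a utility resumes it to get the result
}
```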
Ideally I would love to integrate this functionality with JenkinsPipelineUnit functionality - but for now I am just using pure Spock tests and just mock any steps I need
Ok, got it... Unfortunately it is not enough for complete black-box pipeline testing, for the reasons I described before: Jenkins Pipeline is far more complex than CPS, and it introduces restrictions that are not covered by pure CPS. But if this implementation works for your case, that's great!
Yeah, this is for pure unit testing, and only of the actual code we wrote - i.e. testing that each bit of our code does exactly what we intended it to do. It is not intended to test whether what we intended was the right thing, or whether the code we did not write (i.e. Jenkins and plugins) does the right thing. We would need more for that sort of testing, but so far the vast majority of the issues have been purely in our own code, and our primary goal is to make sure that any change we make to that code does not break builds for everyone using this shared code.
That said, regardless of how you test your code, if you are not testing it running under the CPS transform, your tests are mostly worthless. There are many cases where CPS-transformed code just does not behave the same way as non-transformed code. :-/
I'm strongly +1 on the JenkinsRule approach. I think the challenge really entirely lies in the "dry-run" matter - how do we determine what behaviors are the ones we should mock/skip/whatever we do? How do we mock/skip/whatever those behaviors? How is that extensible and easy for plugins to support?