Bug
Resolution: Fixed
Major
None
Jenkins ver. 2.201 (yum installed, master node only)
durable-task plugin v1.31
os.arch: s390x
os.name: Linux (RedHat)
os.version: 3.10.0-327.el7.s390x
1.33
After upgrading to v1.31, the first sh step in a pipeline gets stuck. After a few minutes, the Console Output shows:
[Pipeline] sh (Get email of the author of last commit)
process apparently never started in /data/jenkins/workspace/TG2_PTG2_-_pipeline_build_master@tmp/durable-be2cf2a6
(running Jenkins temporarily with -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true might make the problem clearer)
Cannot contact : java.io.FileNotFoundException: File '/data/jenkins/workspace/TG2_PTG2_-_pipeline_build_master@tmp/durable-be2cf2a6/output.txt' does not exist
Eventually, I discovered that a new binary was added in the latest version of this plugin. The script compile-binaries.sh in GitHub suggests that the binary is only built for Linux and MacOS.
Sure enough, when I try to execute the binary myself on an architecture other than amd64, I get:
-bash: /data/jenkins/caches/durable-task/durable_task_monitor_1.31_unix_64: cannot execute binary file
Are other architectures or operating systems (Windows) not supported anymore?
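One way to confirm the mismatch described above is to read the machine type recorded in the binary's ELF header and compare it with the host. The following is a minimal sketch, not part of the plugin; the helper name `elf_machine` is hypothetical, and it relies only on the standard ELF header layout (byte 5 is the byte order, bytes 18-19 are `e_machine`).

```shell
#!/bin/sh
# Hypothetical helper (not from the plugin): print the e_machine field of an
# ELF file as a decimal number, or "not-elf" if the magic bytes are missing.
# Well-known values: 3 = x86, 21 = ppc64, 22 = s390, 62 = x86-64, 183 = aarch64.
elf_machine() {
    file=$1
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$file" 2>/dev/null | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Byte 5 (EI_DATA): 1 = little-endian, 2 = big-endian.
    data=$(od -An -tu1 -j5 -N1 "$file" | tr -d ' \n')
    # Bytes 18 and 19 hold e_machine in the file's own byte order.
    set -- $(od -An -tu1 -j18 -N2 "$file")
    if [ "$data" = "2" ]; then
        echo $(( $1 * 256 + $2 ))
    else
        echo $(( $2 * 256 + $1 ))
    fi
}

# Example (path taken from the report above):
#   elf_machine /data/jenkins/caches/durable-task/durable_task_monitor_1.31_unix_64
#   uname -m
```

If the helper prints 62 while `uname -m` reports something other than x86_64 (for example s390x), the kernel cannot execute the file, which matches the "cannot execute binary file" message.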
is duplicated by: JENKINS-60065 durable_task_monitor_1.31_unix_32: Syntax error: "(" unexpected (Closed)
relates to: JENKINS-60065 durable_task_monitor_1.31_unix_32: Syntax error: "(" unexpected (Closed)
[JENKINS-59907] sh steps stuck indefinitely on uncommon architectures (e.g. s390x)
I was also hit by this. Rolling back the durable task plugin to 1.30 got us back up and running.
– edit:
OS: CentOS 7.6 and 7.7 64-bit hosts
Architecture: x86_64
Jenkins 2.190.1 docker
Issue started after upgrading the durable task plugin today to 1.31.
I think this issue's title is just one of the symptoms of the new "Durable Task" plugin v1.31 bug:
- Latest version of Jenkins core (v2.201) running on Ubuntu 16.04
- In one of our pipelines we do not use Docker at all and just execute an "sh" step (on Jenkins master => psst!) via ssh on another server and it fails right away:
- Pipeline code:
...
boolean skipRenewal = true
script.xortex.skipableStage("check need for renewal") {
    def exitCode = script.sh(returnStatus: true, script: makeRenewLocalCertShellCmd(config.domainName)) // ! This is line 54 where the exception happens
    ...
private static String makeRenewLocalCertShellCmd(String domainName) {
    return "ssh -t -t MyServer.ACME.com './renewlocalcert ${domainName}'"
}
- Build log (proprietary) with nice exception stacktrace at the end:
...
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (check need for renewal)
[Pipeline] sh
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] ansiColor
[Pipeline] {
[Pipeline] echo
06:37:13 Pipeline problem: Build failed (check need for renewal) due to: "java.io.IOException: Cannot run program "/var/lib/jenkins/caches/durable-task/durable_task_monitor_1.31_unix_64" (in directory "/var/lib/jenkins/workspace/Sandbox/ACME.renewCerts"): error=13, Permission denied" => Please check the "Console/Log Output" => Failure notification will be sent...
...
java.io.IOException: error=13, Permission denied
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
    at hudson.Proc$LocalProc.<init>(Proc.java:250)
    at hudson.Proc$LocalProc.<init>(Proc.java:219)
    at hudson.Launcher$LocalLauncher.launch(Launcher.java:937)
    at hudson.Launcher$ProcStarter.start(Launcher.java:455)
    at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:230)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:99)
    at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:317)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:286)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:179)
    at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:48)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
Caused: java.io.IOException: Cannot run program "/var/lib/jenkins/caches/durable-task/durable_task_monitor_1.31_unix_64" (in directory "/var/lib/jenkins/workspace/Sandbox/ACME.renewCerts"): error=13, Permission denied
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at hudson.Proc$LocalProc.<init>(Proc.java:250)
    at hudson.Proc$LocalProc.<init>(Proc.java:219)
    at hudson.Launcher$LocalLauncher.launch(Launcher.java:937)
    at hudson.Launcher$ProcStarter.start(Launcher.java:455)
    at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:230)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:99)
    at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:317)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:286)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:179)
    at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:48)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
    at com.ACME.renewcerts.RenewCertsBuild.build(RenewCertsBuild.groovy:54)
    at com.ACME.stage.SkipableStage.execute(SkipableStage.groovy:59)
    at ___cps.transform___(Native Method)
    at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:84)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.dispatch(CollectionLiteralBlock.java:55)
    at com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.item(CollectionLiteralBlock.java:45)
    at sun.reflect.GeneratedMethodAccessor284.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:107)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:87)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
    at com.cloudbees.groovy.cps.Next.step(Next.java:83)
    at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
    at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
    at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
    at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
    at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
    at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:186)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:370)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:93)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:282)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:270)
    at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:66)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
This is NOT related to the other two opened issues with v1.31. Those two are likely caused by the new binary durable_task_monitor not being available from within Docker containers.
But this issue is caused by the binary not being executable on my HW architecture.
I'll clarify the title.
I'm seeing this issue on FreeBSD 11.1 (64-bit) as well.
$ ./durable_task_monitor_1.31_unix_64
ELF binary type "0" not known.
bash: ./durable_task_monitor_1.31_unix_64: cannot execute binary file: Exec format error
Downgrading to version 1.30 works.
Sorry this took so long to address. The binary was not intended to run on non-x86 architectures. Instead, when a non-x86 and non-*NIX architecture is detected, the original shell wrapper was supposed to launch the script. I have a PR up right now that is changing that behavior. I will also update the changelog (that is currently being migrated to github) for this information.
The PR can be found here: https://github.com/jenkinsci/durable-task-plugin/pull/114
UPDATE: more work is being done to this PR to handle a few more cases (such as freebsd)
We have a similar problem on AIX 7.2.
The binary to start is of type ELF.
Just for the record, this issue will also affect Solaris, NetBSD and OpenBSD, regardless of the architecture.
As a side note, the name of the binary is misleading. It should be called durable_task_monitor_X.YZ_linux_amd64, because that is the only OS it can run natively.
I am running an agent on ppc64le and ran into this issue. I will attempt to revert to 1.30.
Thank you, Jenkins team, for your ongoing support.
Version 1.33 has now been released. There is stricter checking on the platforms it runs on. I know not every case has been covered here. The binary is disabled by default, so behavior should be similar, if not identical, to 1.30.
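The behavior described for 1.33 (binary off by default, used only on platforms it was built for, shell wrapper otherwise) can be pictured with the sketch below. This is an illustration only, not the plugin's implementation: the function name `choose_launcher`, its arguments, and the exact platform list are assumptions based on the statements in this thread that the binary is built for Linux and macOS and was not intended for non-x86 architectures.

```shell
#!/bin/sh
# Illustrative only: decide between the native monitor binary and the
# original shell wrapper. Names and the platform list are assumptions.
#   $1 = OS name as printed by `uname -s`
#   $2 = architecture as printed by `uname -m`
#   $3 = "true" to opt in to the binary (defaults to "false")
choose_launcher() {
    os=$1
    arch=$2
    use_binary=${3:-false}
    if [ "$use_binary" = "true" ]; then
        case "$os/$arch" in
            Linux/x86_64|Linux/amd64|Darwin/x86_64)
                echo "binary"
                return 0
                ;;
        esac
    fi
    # Anything else, including every platform when not opted in,
    # falls back to the shell wrapper.
    echo "shell-wrapper"
}

# With the opt-in left at its default, any host gets the shell wrapper.
choose_launcher "$(uname -s)" "$(uname -m)"
```

Under these assumptions an s390x, ppc64le, FreeBSD, or AIX host always takes the shell-wrapper path, even when the opt-in is set.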
The issue is still there with v1.33
logs:
https://gist.github.com/rahul-raj/ddeaa1407827f191e0d1b94966a58a0b
Version from my Jenkins plugin list:
https://user-images.githubusercontent.com/517415/75545150-8a721200-5a4b-11ea-8b49-02d69669184f.png
Please suggest a fix for this.
rahulraj90 it appears you are using x86, and not an "uncommon architecture" like this ticket is describing. I would advise setting LAUNCH_DIAGNOSTICS=true as suggested in the output log; that should tell us more. The default behavior for this plugin should be to use the original script wrappers. If you can't ascertain what is going on there, I would post this to the jenkinsci-users mailing list while we're still investigating.
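For anyone unsure how to set that property, it is a JVM system property on the Jenkins controller process, so it goes before `-jar` on the java command line. The sketch below only prints the command rather than running it; the helper name `diag_command` and the default WAR path are assumptions, so adjust them to your installation.

```shell
#!/bin/sh
# Property name copied from the console message quoted earlier in this ticket.
DIAG_PROP='org.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS'
# Assumed location of the WAR; override with JENKINS_WAR if yours differs.
JENKINS_WAR=${JENKINS_WAR:-/usr/lib/jenkins/jenkins.war}

# Hypothetical helper: print the java command that enables the diagnostics.
diag_command() {
    printf 'java -D%s=true -jar %s\n' "$DIAG_PROP" "$1"
}

diag_command "$JENKINS_WAR"
```

Package-based installs usually start Jenkins through a service wrapper, in which case the same `-D...=true` argument belongs in whatever Java options setting that wrapper reads.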
I've found this also reproduces when using build agents in Kubernetes. The problem here is that Kubernetes launches two containers into a pod with a shared mount: a JNLP slave container, which Jenkins does have permission to write the cache directory in, and a build container (in my case kubectl, but could be any container without a Jenkins user) where it does not necessarily have the same permission, in which code actually runs. The plugin runs its test inside the JNLP container, enables the wrapper, and then exhibits the same hanging behavior when commands are run in the kubectl container.
In the JNLP container:
bash-4.4$ cd /home/jenkins/agent/caches
bash-4.4$ ls -l
total 0
drwxr-xr-x 2 jenkins jenkins 6 Mar 6 15:47 durable-task
In the kubectl container:
I have no name!@<REDACTED>:/home/jenkins/agent/caches$ ls -l
total 0
drwxr-xr-x 2 1000 1000 6 Mar 6 15:47 durable-task
I have no name!@<REDACTED>:/home/jenkins/agent/caches$ id
uid=1001 gid=0(root) groups=0(root)
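The two listings above show the mismatch: the directory is owned by uid 1000 with mode 755, while the build container runs as uid 1001, which only has read and execute access. A quick way to check this from inside any container is sketched below; the function name `check_cache_dir` and the `CACHE_DIR` variable are hypothetical, and the default path is simply the one shown in this comment.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can create files in
# (and enter) the given cache directory.
check_cache_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi
    if [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable"
        return 0
    fi
    echo "not-writable"
    return 1
}

# Default path taken from the listing above; override with CACHE_DIR.
dir=${CACHE_DIR:-/home/jenkins/agent/caches/durable-task}
echo "uid=$(id -u) cache=$(check_cache_dir "$dir")"
```

Running it in both containers of the pod should show "writable" in the JNLP container and "not-writable" in a build container that runs under a different, non-root uid.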
don_code can we move this over to JENKINS-59903? That would be the relevant ticket. Could you also confirm that you are running v1.33 and not v1.31 or v1.32?
I'm still running into this same issue, whilst trying to run a pyinstaller docker image.
The Sh command gives me the following:
```
ERROR: The container started but didn't run the expected command. Please double check your ENTRYPOINT does execute the command passed as docker run argument, as required by official docker images (see https://github.com/docker-library/official-images#consistency for entrypoint consistency requirements).
Alternatively you can force image entrypoint to be disabled by adding option `--entrypoint=''`.
[Pipeline]
$ docker stop --time=1 0d2e194b04bb5a12016da7f4dd92019127837debf082d18ba9fdf4cbbf6abbd7
$ docker rm -f 0d2e194b04bb5a12016da7f4dd92019127837debf082d18ba9fdf4cbbf6abbd7
[Pipeline] // withDockerContainer
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // stage
[Pipeline] End of Pipeline
ERROR: script returned exit code -2
Finished: FAILURE
```
Running latest jenkins/blueocean image and I'm currently using the following set up for my Jenkinsfile.
```
pipeline {
    agent none
    options {
        skipStagesAfterUnstable()
    }
    stages {
        stage('Build') {
            agent {
                docker {
                    image 'python:2-alpine'
                }
            }
            steps {
                sh 'python -m py_compile sources/add2vals.py sources/calc.py'
            }
        }
        stage('Test') {
            agent {
                docker {
                    image 'qnib/pytest'
                }
            }
            steps {
                sh 'py.test --verbose --junit-xml test-reports/results.xml sources/test_calc.py'
            }
            post {
                always {
                    junit 'test-reports/results.xml'
                }
            }
        }
        stage('Deliver') {
            agent {
                docker {
                    image 'cdrx/pyinstaller-linux:python2'
                    args 'docker run -v "/var/jenkins_home/workspace/simple-python-pyinstaller-app/sources:/src/" --name pyinstaller --entrypoint= cdrx/pyinstaller-linux:python2'
                }
            }
            steps {
                sh 'pyinstaller --onefile sources/add2vals.py'
            }
            post {
                success {
                    archiveArtifacts 'dist/add2vals'
                }
            }
        }
    }
}
```
I managed to hit this using dir(). This is clearly a bug and causes people to write workarounds.
At the second stage, using dir(), it gets stuck for around 6-7 minutes and eventually it fails with "process apparently never started in /opt@tmp/durable-5a20a76a".
pipeline {
    agent {
        docker {
            label '********'
            image '**********'
            registryUrl '************'
            registryCredentialsId '*******'
            args '--user root:root'
        }
    }
    stages {
        stage('dir-testing') {
            stages {
                stage('without dir') {
                    steps {
                        sh 'cd /opt && ls -l'
                    }
                }
                stage('with dir') {
                    steps {
                        dir('/opt') {
                            sh 'ls -l'
                        }
                    }
                }
            }
            post {
                always {
                    cleanWs()
                }
            }
        }
    }
}
Started by user ********** Running in Durability level: MAX_SURVIVABILITY [Pipeline] Start of Pipeline [Pipeline] node Running on ************ in /var/jenkins/workspace/test-cwd-bug [Pipeline] { [Pipeline] withEnv [Pipeline] { [Pipeline] withDockerRegistry Using the existing docker config file.Removing blacklisted property: auths$ docker login -u ******** -p ******** ********* WARNING! Using --password via the CLI is insecure. Use --password-stdin. Login Succeeded [Pipeline] { [Pipeline] isUnix [Pipeline] sh + docker inspect -f . ********** Error: No such object: *********** [Pipeline] isUnix [Pipeline] sh + docker inspect -f . **************** . [Pipeline] withDockerContainer ************* does not seem to be running inside a container $ docker run -t -d -u 0:0 --user root:root -w /var/jenkins/workspace/test-cwd-bug -v /var/jenkins/workspace/test-cwd-bug:/var/jenkins/workspace/test-cwd-bug:rw,z -v /var/jenkins/workspace/test-cwd-bug@tmp:/var/jenkins/workspace/test-cwd-bug@tmp:rw,z -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** ********************* cat $ docker top 969a08a99a24c314d5d80f2cbf77920db4e269524d1af6738c0ddc5417da3f16 -eo pid,comm [Pipeline] { [Pipeline] stage [Pipeline] { (dir-testing) [Pipeline] stage [Pipeline] { (without dir) [Pipeline] sh + cd /opt + ls -l total 8 drwxr-xr-x 4 root root 4096 Jul 24 15:29 artifactory-scripts drwxr-xr-x 1 608 500 4096 Jul 24 15:39 cv25_linux_sdk_2.5 [Pipeline] } [Pipeline] // stage [Pipeline] stage [Pipeline] { (with dir) [Pipeline] dir Running in /opt [Pipeline] { [Pipeline] sh process apparently never started in /opt@tmp/durable-5a20a76a (running Jenkins temporarily with -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true might make the 
problem clearer) [Pipeline] } [Pipeline] // dir [Pipeline] } [Pipeline] // stage Post stage [Pipeline] cleanWs [WS-CLEANUP] Deleting project workspace... [WS-CLEANUP] Deferred wipeout is used... [WS-CLEANUP] done [Pipeline] } [Pipeline] // stage [Pipeline] } $ docker stop --time=1 969a08a99a24c314d5d80f2cbf77920db4e269524d1af6738c0ddc5417da3f16 $ docker rm -f 969a08a99a24c314d5d80f2cbf77920db4e269524d1af6738c0ddc5417da3f16 [Pipeline] // withDockerContainer [Pipeline] } [Pipeline] // withDockerRegistry [Pipeline] } [Pipeline] // withEnv [Pipeline] } [Pipeline] // node [Pipeline] End of Pipeline ERROR: script returned exit code -2 Finished: FAILURE
smirky can you tell me what architecture your agent is running on? The thing is, if you are using 1.33 or greater, you should be running the traditional shell-based durable-task where there would not be any issues in running on a non-windows/non-unix architecture.
It appears you are using a Unix architecture so this actually might not be the right ticket you're looking for.
This ticket was also incorrectly reopened by a user reporting an issue with an x86 architecture that was not related to this ticket. Closing this ticket again. Will reopen if we discover new issues relating to non-unix/non-windows architectures.
The issue was incorrectly reopened in the past. Closing again until we encounter a bug related to non-windows/non-unix architecture
carroll - Indeed, it is x64 Ubuntu 18.04. Since you mentioned that this ticket is not suitable for my use-case, I opened a new one, describing the problem:
https://issues.jenkins-ci.org/browse/JENKINS-63253
I confirm it affects me as well:
Jenkins ver. 2.190.1, docker, kubernetes and other plugins: latest stable version.
Jenkins runs on linux (inside a docker container)