- Bug
- Resolution: Unresolved
- Major
- None
Environment:
jenkins:latest Docker container master, no slaves
Docker 1.8.2 on CentOS Linux release 7.1.1503 (Core)
Jenkins 1.642.1
Docker Pipeline plugin 1.13
Docker Commons plugin 1.2
FROM jenkins:latest
COPY plugins.txt /tmp/plugins.txt
RUN /usr/local/bin/plugins.sh /tmp/plugins.txt
Dockerfile building an image I called jenkins-configured:
USER root
COPY JENKINS_HOME /usr/share/jenkins/ref
RUN chown -R jenkins.jenkins /usr/share/jenkins/ref
RUN apt-get update -qq && apt-get install -y docker.io && apt-get clean
# jenkins accesses docker from host, it has root-equivalent rights
RUN usermod -G 0 -a jenkins
USER jenkins
CMD /usr/local/bin/jenkins.sh
Running jenkins:
docker run --name jenkins-run -p 8080:8080 -p 50000:50000 \
-v /var/jenkins_home:/var/jenkins_home:z \
-v /var/run/docker.sock:/var/run/docker.sock \
--privileged \
jenkins-configured
Description:
I initially encountered this problem when running 'make check' for our software inside a Docker container created by the Docker Pipeline plugin; the build hung forever:
http://gitweb.skylable.com/gitweb/?p=sx.git;a=blob;f=Jenkinsfile;h=93a60925efe2fff0e5428e987ec3d24e593b76e9;hb=refs/heads/ci
I have created a minimal test case for this bug report that consists only of a Groovy script with no external dependencies (see the attached config.xml), and tracked the problem down to this line in the docker-workflow plugin:
https://github.com/jenkinsci/docker-workflow-plugin/blob/566738205795b939b72d337557fa3514c141295a/src/main/java/org/jenkinsci/plugins/docker/workflow/WithContainerStep.java#L138
Please provide a way to override the 'cat' command in the Groovy DSL, or update the workflow plugin to avoid the zombie issue.
Steps to reproduce:
1. use attached config.xml to create a new Pipeline project (I called it docker-zombies)
2. press build now
3. watch console output of the build
Expected results:
job finishes
Actual results:
job runs forever
Additional information:
The job waits for a process to exit gracefully by checking whether its PID is still alive with kill -0 (the process is not a direct child of the script, so 'wait' cannot be used).
The process has in fact exited and become a zombie (<defunct>), which is supposed to be reaped by PID 1. However, PID 1 (the first process started in the Docker container) is 'cat', which never reaps children, so the zombie processes stay around forever and kill -0 keeps succeeding.
This is a well-known problem with Docker: https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/. The usual solution is to run a shell such as bash as PID 1 and run any other commands, such as cat, as its children.
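The zombie behaviour described above can be reproduced without Jenkins or Docker. The following is a minimal sketch, assuming a Linux system with /proc mounted; `make_zombie` and `zombie_pid` are hypothetical names chosen for this illustration. A shell starts a short-lived child and then exec()s into `sleep`, which, like `cat`, never calls wait(), so the exited child stays defunct while `kill -0` continues to report it as alive.

```shell
#!/bin/sh
# make_zombie: leave a defunct process behind and store its PID in $zombie_pid.
# Hypothetical helper for illustration only.
make_zombie() {
    pidfile=$(mktemp)
    # The inner shell forks `sleep 1`, records its PID, then replaces itself
    # with `sleep 6`. The new parent never reaps the child once it exits.
    sh -c 'sleep 1 & echo $! > "$0"; exec sleep 6' "$pidfile" >/dev/null 2>&1 &
    sleep 2                      # by now the child has exited; its parent has not
    zombie_pid=$(cat "$pidfile")
    rm -f "$pidfile"
}

make_zombie

# kill -0 only checks that the PID exists, so it succeeds for a zombie too.
if kill -0 "$zombie_pid" 2>/dev/null; then
    echo "kill -0 still succeeds for PID $zombie_pid"
fi

# The kernel reports the real state: this prints the State line, showing Z (zombie).
grep '^State:' "/proc/$zombie_pid/status"
```

The zombie disappears only once its parent exits and a process that does reap children adopts it, which is exactly what does not happen when the adopting PID 1 is `cat`.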
Here is the console output when I run the attached job:
Started by user Admin
[Pipeline] Allocate node : Start
Running on master in /var/jenkins_home/workspace/docker-zombies
[Pipeline] node {
[Pipeline] sh
[docker-zombies] Running shell script
+ docker inspect -f . buildpack-deps:latest
.
[Pipeline] Run build steps inside a Docker container : Start
$ docker run -t -d -u 1000:1000 -w /var/jenkins_home/workspace/docker-zombies -v /var/jenkins_home/workspace/docker-zombies:/var/jenkins_home/workspace/docker-zombies:rw -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** buildpack-deps:latest cat
[Pipeline] withDockerContainer {
[Pipeline] writeFile
[Pipeline] sh
[docker-zombies] Running shell script
+ exec /usr/bin/python test.py
[Pipeline] sh
[docker-zombies] Running shell script
+ ps -ef
UID PID PPID C STIME TTY TIME CMD
1000 1 0 0 16:02 ? 00:00:00 cat
1000 11 1 0 16:02 ? 00:00:00 [python] <defunct>
1000 12 1 0 16:02 ? 00:00:00 [python] <defunct>
1000 45 0 0 16:02 ? 00:00:00 sh -c echo $$ > '/var/jenkins_home/workspace/docker-zombies/.jenkins-3a360d3d/pid'; jsc=durable-0b3272bed0cac7d970b36ad24e8c046c; JENKINS_SERVER_COOKIE=$jsc '/var/jenkins_home/workspace/docker-zombies/.jenkins-3a360d3d/script.sh' > '/var/jenkins_home/workspace/docker-zombies/.jenkins-3a360d3d/jenkins-log.txt' 2>&1; echo $? > '/var/jenkins_home/workspace/docker-zombies/.jenkins-3a360d3d/jenkins-result.txt'
1000 49 45 0 16:02 ? 00:00:00 /bin/sh -xe /var/jenkins_home/workspace/docker-zombies/.jenkins-3a360d3d/script.sh
1000 50 49 0 16:02 ? 00:00:00 ps -ef
[Pipeline] sh
[docker-zombies] Running shell script
+ cat pidfile
+ PID=11
+ kill -0 11
+ echo Waiting for 11 to exit
Waiting for 11 to exit
+ sleep 1
+ kill -0 11
+ echo Waiting for 11 to exit
Waiting for 11 to exit
+ sleep 1
+ kill -0 11
+ echo Waiting for 11 to exit
Waiting for 11 to exit
+ sleep 1
+ kill -0 11
+ echo Waiting for 11 to exit
Waiting for 11 to exit
+ sleep 1
+ kill -0 11
+ echo Waiting for 11 to exit
Waiting for 11 to exit
+ sleep 1
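The polling loop in the console output above never terminates because `kill -0` cannot tell a zombie from a running process. As a possible job-side workaround (not a fix for the plugin itself), the loop could check the process state instead. This is a sketch under stated assumptions: it is Linux-only, it needs /proc inside the container, and `is_alive` and `wait_for_exit` are hypothetical helper names, not part of Jenkins or the Docker Pipeline plugin.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Jenkins or the Docker Pipeline plugin.
# Linux-only: relies on /proc being mounted inside the container.

# is_alive PID: succeed only if PID exists and is neither a zombie (Z) nor dead (X).
is_alive() {
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    rest=${stat##*") "}    # strip "pid (comm) "; comm itself may contain ")"
    state=${rest%% *}      # the next field is the one-letter process state
    [ "$state" != "Z" ] && [ "$state" != "X" ]
}

# wait_for_exit PID [TIMEOUT]: poll once per second until PID has exited
# (or is defunct); fail if it is still running after TIMEOUT seconds (default 60).
wait_for_exit() {
    remaining=${2:-60}
    while is_alive "$1"; do
        [ "$remaining" -gt 0 ] || return 1
        remaining=$((remaining - 1))
        sleep 1
    done
}

# Usage in the job script, replacing the kill -0 loop:
#   wait_for_exit "$(cat pidfile)" 120 || echo "process did not exit in time"
```

The timeout is a deliberate addition so that a job cannot spin forever even if the state check is wrong for some reason. The zombies themselves still accumulate until the container is removed.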
Is related to:
- JENKINS-45888 Image.inside and Agent should use --init with 'docker run' (Open)