Bug
Resolution: Not A Defect
Blocker
Jenkins version: 2.462.3
Docker version: 20.10.23
The error
I am running a Docker container as a Docker cloud agent, built from the following Dockerfile:
FROM oraclelinux:7-slim
RUN yum -y install oracle-release-el7 && \
    yum -y install oracle-instantclient19.8-basic oracle-instantclient19.8-sqlplus && \
    yum clean all
RUN curl -L -o /tmp/jdk-17_linux-x64_bin.rpm \
    https://download.oracle.com/java/17/archive/jdk-17_linux-x64_bin.rpm && \
    yum -y localinstall /tmp/jdk-17_linux-x64_bin.rpm && \
    rm -f /tmp/jdk-17_linux-x64_bin.rpm && \
    yum clean all
ENV LD_LIBRARY_PATH=/usr/lib/oracle/19.8/client64/lib
RUN mkdir -p /my/path
WORKDIR /my/path
RUN adduser -u 8877 -m myuser
USER myuser
CMD ["/bin/bash"]
The result is that multiple containers from this image keep being created and started, but the job (attached to the Docker cloud agent by a label) stays stuck at:
Still waiting to schedule task
Waiting for next available executor
In fact, in the docker cloud agent logs, I can find:
com.github.dockerjava.api.exception.InternalServerErrorException: Status 500: {"message":"unable to find user docker run --user myuser: no matching entries in passwd file"}
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:247)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.DefaultInvocationBuilder.post(DefaultInvocationBuilder.java:102)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.exec.StartContainerCmdExec.execute(StartContainerCmdExec.java:31)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.exec.StartContainerCmdExec.execute(StartContainerCmdExec.java:13)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.exec.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:21)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:33)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.command.StartContainerCmdImpl.exec(StartContainerCmdImpl.java:42)
    at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.DockerTemplate.doProvisionNode(DockerTemplate.java:755)
    at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.DockerTemplate.provisionNode(DockerTemplate.java:686)
    at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.DockerCloud$1.run(DockerCloud.java:414)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
    at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
And on Nodes logs:
Connecting to docker container 28b538b5567c60dff1f4aa1e27b0a4d197917ec49b02834dbf19e990ecd277c0, running command java -jar /my/path/remoting-3248.3250.v3277a_8e88c9b_.jar -noReconnect -noKeepAlive -agentLog /my/path/agent.log
HTTP/1.1 101 UPGRADED
Content-Type: application/vnd.docker.raw-stream
Connection: Upgrade
Upgrade: tcp
Api-Version: 1.41
Docker-Experimental: false
Ostype: linux
Server: Docker/20.10.23 (linux)
ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
Also: java.lang.Throwable: launched here
    at hudson.slaves.SlaveComputer._connect(SlaveComputer.java:286)
    at hudson.model.Computer.connect(Computer.java:451)
    at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.start(DockerOnceRetentionStrategy.java:145)
    at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.start(DockerOnceRetentionStrategy.java:49)
    at hudson.model.AbstractCIBase.createNewComputerForNode(AbstractCIBase.java:192)
    ...
Note the "This is probably a bug in Jenkins"...
What I've tried
1. Running the container as root (by deleting the RUN adduser and USER myuser instructions, the 7th and 8th instructions in the Dockerfile). In that case the agent is provisioned successfully.
2. Running the container as 'nobody' (by typing the user in Jenkins cloud agent UI settings, Docker Agent templates -> Container settings -> User; or Docker Agent templates -> Connect method -> User), and the behavior is the same as when I run the cloud agent with user "myuser".
(The user nobody exists in /etc/group; I can see it when I simply run my container from a terminal with the docker run command.)
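Since the Docker error above refers to the passwd file rather than /etc/group, one possible cross-check is to look the name up in the image's /etc/passwd. The following is only a minimal sketch: the helper name has_passwd_entry, the image tag my-agent-image, and the file /tmp/agent-passwd are hypothetical and not part of the original setup.

```shell
#!/bin/sh
# has_passwd_entry NAME [FILE]
# Hypothetical helper: succeeds (exit 0) if NAME is exactly the first
# colon-separated field of some line in a passwd-format FILE.
# FILE defaults to /etc/passwd.
has_passwd_entry() {
    name=$1
    file=${2:-/etc/passwd}
    awk -F: -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$file"
}

# Example use (assumes an image tagged my-agent-image, a made-up name):
#   docker run --rm my-agent-image cat /etc/passwd > /tmp/agent-passwd
#   has_passwd_entry myuser /tmp/agent-passwd && echo "myuser is in passwd"
#   has_passwd_entry nobody /tmp/agent-passwd && echo "nobody is in passwd"
```

Note that the error message quotes the user it failed to find as the whole string "docker run --user myuser", not just "myuser". A lookup for that full string would fail even when myuser itself is present, so it may be worth confirming that the User field in the template holds only the user name.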
The Jenkins project stopped supporting Red Hat Enterprise Linux 7 and its derivatives (like Oracle Linux 7) in November 2023 as announced in a blog post. Does the same failure happen with Oracle Linux 8 or Oracle Linux 9?
The request seems to be more about asking for help to diagnose an agent startup issue. You may find more people will read your request for diagnostic help if you post on https://community.jenkins.io or if you try the Jenkins user mailing list. More people read those locations than read individual issue reports here.
If you believe that this is a Jenkins bug, please provide the additional details requested in "How to report an issue". That increases the chances that someone will volunteer their time to try to duplicate the issue that you are seeing.