Docker cloud agent as non-root unable to be provisioned

       

      The error

      I am running a Docker container as a Docker cloud agent, built from the following Dockerfile:

      FROM oraclelinux:7-slim
      RUN yum -y install oracle-release-el7 && \
          yum -y install oracle-instantclient19.8-basic oracle-instantclient19.8-sqlplus && \
          yum clean all
      RUN curl -L -o /tmp/jdk-17_linux-x64_bin.rpm \
      https://download.oracle.com/java/17/archive/jdk-17_linux-x64_bin.rpm && \
          yum -y localinstall /tmp/jdk-17_linux-x64_bin.rpm && \
          rm -f /tmp/jdk-17_linux-x64_bin.rpm && \
          yum clean all
      ENV LD_LIBRARY_PATH=/usr/lib/oracle/19.8/client64/lib
      RUN mkdir -p /my/path
      WORKDIR /my/path
      RUN adduser -u 8877 -m myuser
      USER myuser
      CMD ["/bin/bash"]

      The behavior is that new instances of this Docker image keep being created and started, but the job (tied to the Docker cloud agent by a label) gets stuck at:

      Still waiting to schedule task
      Waiting for next available executor

      In fact, in the docker cloud agent logs, I can find:

      com.github.dockerjava.api.exception.InternalServerErrorException: Status 500: {"message":"unable to find user docker run --user myuser: no matching entries in passwd file"}
      
      	at PluginClassLoader for docker-java-api//com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:247)
      	at PluginClassLoader for docker-java-api//com.github.dockerjava.core.DefaultInvocationBuilder.post(DefaultInvocationBuilder.java:102)
      	at PluginClassLoader for docker-java-api//com.github.dockerjava.core.exec.StartContainerCmdExec.execute(StartContainerCmdExec.java:31)
      	at PluginClassLoader for docker-java-api//com.github.dockerjava.core.exec.StartContainerCmdExec.execute(StartContainerCmdExec.java:13)
      	at PluginClassLoader for docker-java-api//com.github.dockerjava.core.exec.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:21)
      	at PluginClassLoader for docker-java-api//com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:33)
      	at PluginClassLoader for docker-java-api//com.github.dockerjava.core.command.StartContainerCmdImpl.exec(StartContainerCmdImpl.java:42)
      	at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.DockerTemplate.doProvisionNode(DockerTemplate.java:755)
      	at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.DockerTemplate.provisionNode(DockerTemplate.java:686)
      	at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.DockerCloud$1.run(DockerCloud.java:414)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
      	at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
      	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
      	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
      	at java.base/java.lang.Thread.run(Unknown Source) 
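
One possible reading of the exception message, offered as a guess rather than a confirmed diagnosis: the daemon names the user it could not resolve, and in the log above that string is `docker run --user myuser` rather than `myuser`. The sketch below assumes the message has the shape `unable to find user <name>: no matching entries in passwd file` and extracts the name; `requested_user` is a hypothetical helper written for this illustration, not part of Jenkins or docker-java.

```python
import re

# Error text copied from the docker cloud agent log above.
DAEMON_MESSAGE = (
    "unable to find user docker run --user myuser: "
    "no matching entries in passwd file"
)

# Assumed message shape:
#   "unable to find user <name>: no matching entries in passwd file"
PATTERN = re.compile(
    r"unable to find user (?P<user>.+): no matching entries in passwd file"
)


def requested_user(message):
    """Return the user string the daemon tried to resolve, or None."""
    match = PATTERN.search(message)
    return match.group("user") if match else None


user = requested_user(DAEMON_MESSAGE)
print(repr(user))
# A plain account name would normally contain no whitespace.
print("contains whitespace:", any(ch.isspace() for ch in user))
```

If the extracted value turns out to be full command text like this, it may be worth checking whether one of the User fields in the agent template holds that text instead of only the user name.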

      And in the node logs:

      Connecting to docker container 28b538b5567c60dff1f4aa1e27b0a4d197917ec49b02834dbf19e990ecd277c0, running command java -jar /my/path/remoting-3248.3250.v3277a_8e88c9b_.jar -noReconnect -noKeepAlive -agentLog /my/path/agent.log
      HTTP/1.1 101 UPGRADED
      Content-Type: application/vnd.docker.raw-stream
      Connection: Upgrade
      Upgrade: tcp
      Api-Version: 1.41
      Docker-Experimental: false
      Ostype: linux
      Server: Docker/20.10.23 (linux)
      ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
      Also:   java.lang.Throwable: launched here
          at hudson.slaves.SlaveComputer._connect(SlaveComputer.java:286)
          at hudson.model.Computer.connect(Computer.java:451)
          at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.start(DockerOnceRetentionStrategy.java:145)
          at PluginClassLoader for docker-plugin//com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.start(DockerOnceRetentionStrategy.java:49)
          at hudson.model.AbstractCIBase.createNewComputerForNode(AbstractCIBase.java:192)
          ... 

      Note the "This is probably a bug in Jenkins"...

       

      What I've tried

      1. Running the container as root (by deleting the 7th and 8th instructions of the Dockerfile, RUN adduser and USER myuser). It is provisioned successfully.
      2. Running the container as 'nobody' (by typing the user in the Jenkins cloud agent UI settings, under Docker Agent templates -> Container settings -> User, or Docker Agent templates -> Connect method -> User). The behavior is the same as when I run the cloud agent with user "myuser".
      (The user nobody exists in /etc/group; I can see it when I simply run my container from a terminal with the docker run command.)
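
The error message refers to the passwd file, while the check described above was made against /etc/group. As a minimal sketch, assuming the standard passwd layout where the account name is the first colon-separated field, a lookup could be done as follows. `passwd_has_user` is a hypothetical helper, and `SAMPLE_PASSWD` is made-up content for illustration, not the contents of the real image.

```python
def passwd_has_user(passwd_text, name):
    """Return True if passwd-format text has an entry whose first field is name."""
    for line in passwd_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        if line.split(":", 1)[0] == name:
            return True
    return False


# Hypothetical file contents for illustration only.
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
nobody:x:99:99:Nobody:/:/sbin/nologin
myuser:x:8877:8877::/home/myuser:/bin/bash
"""

for candidate in ("myuser", "nobody", "docker run --user myuser"):
    print(candidate, "->", passwd_has_user(SAMPLE_PASSWD, candidate))
```

Inside the container, the same function could be fed the text read from /etc/passwd to confirm which account names are actually present.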

          [JENKINS-73990] Docker cloud agent as non-root unable to be provisioned

          Mark Waite added a comment -

          The Jenkins project stopped supporting Red Hat Enterprise Linux 7 and its derivatives (like Oracle Linux 7) in November 2023 as announced in a blog post. Does the same failure happen with Oracle Linux 8 or Oracle Linux 9?

          The request seems to be more about asking for help to diagnose an agent startup issue. You may find more people will read your request for diagnostic help if you post on https://community.jenkins.io or if you try the Jenkins user mailing list. More people read those locations than read individual issue reports here.

          If you believe that this is a Jenkins bug, please provide the additional details requested in "How to report an issue". That increases the chances that someone will volunteer their time to try to duplicate the issue that you are seeing.

          Mark Waite added a comment -

          Closing after a month with no further information to show that the issue is a Jenkins bug rather than a configuration issue in the agent definition.

            ericcitaire Eric Citaire
            giuliano Giuliano