
image.inside { ... } does not create a fully configured user account

      The inside command shares the underlying workspace filesystem with the container and runs docker run with the --user option, so the processes started inside run under the same uid/gid that owns the working directory. This causes problems for system tools that do not expect to run under a uid/gid that does not exist in the container, with an environment that is not fully configured. The result is hard-to-debug failures, since few tools expect that the current user cannot be looked up in /etc/passwd or that HOME is not set. Reportedly, it is not even POSIX compliant.

      It does not take long to find a container/tool combination that blows up with an artificial uid/gid:

      $ docker run -i fedora dnf update
        ... # OK
      $ docker run -u 4242:4242 -i fedora dnf update
      Traceback (most recent call last):
        File "/usr/bin/dnf", line 36, in <module>
          main.user_main(sys.argv[1:], exit_code=True)
        File "/usr/lib/python2.7/site-packages/dnf/cli/main.py", line 185, in user_main
          errcode = main(args)
        File "/usr/lib/python2.7/site-packages/dnf/cli/main.py", line 84, in main
          return _main(base, args)
        File "/usr/lib/python2.7/site-packages/dnf/cli/main.py", line 115, in _main
          cli.configure(map(ucd, args))
        File "/usr/lib/python2.7/site-packages/dnf/cli/cli.py", line 932, in configure
          self.read_conf_file(opts.conffile, root, releasever, overrides)
        File "/usr/lib/python2.7/site-packages/dnf/cli/cli.py", line 1043, in read_conf_file
          conf.prepend_installroot(opt)
        File "/usr/lib/python2.7/site-packages/dnf/yum/config.py", line 718, in prepend_installroot
          path = path.lstrip('/')
      AttributeError: 'NoneType' object has no attribute 'lstrip'
      

      The message does not indicate the real cause at all.
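      The same symptom can be reproduced from a Pipeline. A minimal sketch, assuming the Docker Pipeline plugin and an agent with Docker available (the image name is illustrative and the exact failure message varies by image):

      node {
          // inside effectively runs: docker run -u <agent uid>:<agent gid> -v <workspace>:<workspace> ...
          docker.image('fedora').inside {
              sh 'id'                 // agent uid/gid, usually with no matching /etc/passwd entry
              sh 'echo "HOME=$HOME"'  // HOME is typically not a usable, writable home directory
              sh 'whoami || true'     // often prints "cannot find name for user ID <uid>"
          }
      }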

          [JENKINS-38438] image.inside { ... } does not create a fully configured user account

          Sylvain Rollinet added a comment -

          It doesn't seem related. For me it is the same problem as described: "dotnet restore" expects to have a valid $HOME folder.

          Matthew Mitchell added a comment -

          Right, this is just because -u is passed. dotnet restore wants a NuGet cache. I think that this plugin should not pass -u by default; inside(...) can be used to pass it if necessary.

          In our case we have additional infrastructure around our containers, so we're not using docker.build.

          Oliver Gondža added a comment - edited

          mmitche, but otherwise you might not have sufficient permissions in the current folder, which is likely to cause problems even more often. EDIT: When I think of it, the user will likely default to root, creating files that cannot be manipulated once you leave the inside step.


          Matthew Mitchell added a comment -

          olivergondza We can make sure we clean up the container before exiting. The issue here is that these are containers shared among different systems and teams, not built on the fly with docker.build. So I don't have control over the users available in the container. The default for docker is to run as root within the container, so we're going to stick with that for now, unless you have another suggestion for how we should be managing use cases like this.

          I did find that you can pass -u 0:0 to the inside() step to get the desired behavior.
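          For reference, a minimal sketch of that workaround. The image name is illustrative; the extra arguments to inside() are appended to the docker run command line, so the explicit -u 0:0 takes effect as described above:

          docker.image('mcr.microsoft.com/dotnet/sdk').inside('-u 0:0') {
              // Runs as root inside the container, so $HOME and the NuGet cache resolve normally.
              sh 'dotnet restore'
          }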

          Jesse Glick added a comment -

          Do not plan to change the current behavior here.


          Jason MCDev added a comment -

          I am really struggling with this. We have existing builds that run inside of Docker. We set all of our environment variables in the Dockerfile: LD_LIBRARY_PATH, PATH, etc.

          When I run this image through Image.inside, I have none of those variables. I have been trying to work around this for days. What is the expected workflow for this situation? Please help!


          Jesse Glick added a comment -

          jsnmcdev do not use Image.inside for such cases. Write a per-project Dockerfile performing your build stuff and run it directly with docker commands.

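          A rough sketch of that approach in a scripted Pipeline, with the Dockerfile name, image tag and build script all illustrative rather than taken from this thread:

          node {
              checkout scm
              // Bake tools and environment (PATH, LD_LIBRARY_PATH, ...) into a per-project image.
              sh 'docker build -t myproject-build -f Dockerfile.build .'
              // Run the build directly with docker, so the image's USER and ENV apply as written.
              sh 'docker run --rm -v "$PWD":/src -w /src myproject-build ./build.sh'
          }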

          Eric Tan added a comment -

          Observations:

          uid 1000 and gid 1000 are not associated with actual names like jenkins, so 'whoami' will not work.

          Also running "pip install --user module " will fail since pip writes some objects in the user's own home directory (<user_home>/pip-stuff). Since there is no name associated with uid 1000, pip will write to the root directory instead (/pip-stuff), which it does not have permissions.

          Jason is right to suggest doing all build stuff in the Dockerfile.


          Rodrigo Carvalho Silva added a comment -

          Hi!

          Is this issue planned to be resolved? I was stuck with the following error on a Maven build and I had no clue what caused it:

          The specified user settings file does not exist: /path/on/jenkins/host/?/.m2/settings.xml

          I think it's a very important improvement. Other CI/CD tools handle this when working with Docker to avoid such problems...

          Maybe I can try to send you a pull request with some guidance. In which class should I implement this?
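          For what it is worth, the ? in that path is what Java substitutes for user.home when the container user cannot be resolved. One workaround in the spirit of the -u override mentioned above is to point HOME at a writable directory via the inside() arguments. A hedged sketch, with an illustrative Maven image:

          docker.image('maven:3-jdk-8').inside("-e HOME=${env.WORKSPACE}") {
              // With HOME pointing at a writable directory, Maven can resolve ~/.m2 again.
              sh 'mvn -B clean verify'
          }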

          Cristian added a comment - edited

          Really I am not sure how this could be expected to work. Typically the Jenkins agent will be running as a jenkins user or similar, and so files in the workspace need to be owned by that user so they can be read and written by other steps (or even deleted by the next build, etc.). Since the --volume option offers no apparent UID mapping feature, that means that the active processes in the container must use the same UID/GID as the agent.

          Fast-forward to 2021: nowadays we have user namespaces, especially with podman.

          It is expected that there are two kinds of commands: tool/environment setup, which should be done using build with a Dockerfile, thus cached and able to run as any USER; and actual build steps, which are focused on manipulating workspace files and should run without privilege. If your build is really “about” creating an image or otherwise configuring things at the OS level, inside is inappropriate; you probably want to use only build and maybe run/withRun. Probably this usage distinction needs to be more clearly documented.
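          A hedged sketch of that split, with the image tag, Dockerfile path and build command all illustrative:

          node {
              checkout scm
              // Tool/environment setup: baked into the image via build(), cached, free to run as any USER.
              def buildEnv = docker.build('myproject-buildenv', '-f Dockerfile.build .')
              // Actual build steps: only manipulate workspace files, run unprivileged via inside().
              buildEnv.inside {
                  sh 'make all'
              }
          }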

          I don't think the documentation was ever changed. If it helps somebody, I have a Dockerfile like this

          ARG BASE_IMAGE
          FROM ${BASE_IMAGE}
          
          ARG USER_ID
          ARG GROUP_ID
          RUN groupadd -g ${GROUP_ID} builder
          RUN useradd -u ${USER_ID} -g ${GROUP_ID} builder
          

          And my Jenkinsfile says

          agent {
            dockerfile {
              additionalBuildArgs "--build-arg BASE_IMAGE=<the_image_I_want_to_use> --build-arg USER_ID=\$(id -u) --build-arg GROUP_ID=\$(id -g)"
            }
          }
          

          when it used to say

          agent {
            docker {
              image "<the_image_I_want_to_use>"
            }
          }
          

