Jenkins / JENKINS-45755

Unable to launch SSH Slave since 2.68 when HOME is not writable on Master


    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Environment: Single master and single slave setup, both running Debian, using an SSH slave where the master user is not the same as the slave user. Jenkins 2.71 / Remoting 3.10

      Last working version for me: Jenkins 2.67

      Versions not working: 2.68 through 2.78

      Issue: SSH Slave will not launch

      Something in the 2.68 release seems to have changed how the environment is set up for slaves. The user my slave connects as has the home directory /home/SPALDING/jenkinsbuildserver; however, the error message shows that jarCache is attempting to use the master user's home directory, /usr/share/tomcat8, which doesn't exist on the slave.

      I figured I could at least work around this issue by setting the new "workDir" parameter, but the launch still fails. Even if I create the /usr/share/tomcat8/.jenkins/cache/jars directory on the slave and ensure it is writable by the slave user, the launch still fails.

      There appear to be two issues:

      • The slave is not using the correct location for the default jar cache because it seems to be using the home directory of the master user.
      • Even when the 'workDir' parameter is specified, Jenkins is still trying to validate the default location.
      [07/24/17 10:42:11] [SSH] Opening SSH connection to SLAVEHOST:22.
      [07/24/17 10:42:11] [SSH] WARNING: SSH Host Keys are not being verified. Man-in-the-middle attacks may be possible against this connection.
      [07/24/17 10:42:11] [SSH] Authentication successful.
      [07/24/17 10:42:11] [SSH] The remote users environment is:
      BASH=/bin/bash
      BASHOPTS=cmdhist:complete_fullquote:extquote:force_fignore:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath
      BASH_ALIASES=()
      BASH_ARGC=()
      BASH_ARGV=()
      BASH_CMDS=()
      BASH_EXECUTION_STRING=set
      BASH_LINENO=()
      BASH_SOURCE=()
      BASH_VERSINFO=([0]="4" [1]="3" [2]="30" [3]="1" [4]="release" [5]="x86_64-pc-linux-gnu")
      BASH_VERSION='4.3.30(1)-release'
      DIRSTACK=()
      EUID=10169
      GROUPS=()
      HOME=/home/SPALDING/jenkinsbuildserver
      HOSTNAME=SLAVEHOST
      HOSTTYPE=x86_64
      IFS=$' \t\n'
      LANG=en_US.UTF-8
      LOGNAME=jenkinsbuildserver
      MACHTYPE=x86_64-pc-linux-gnu
      MAIL=/var/mail/jenkinsbuildserver
      OPTERR=1
      OPTIND=1
      OSTYPE=linux-gnu
      PATH=/usr/local/bin:/usr/bin:/bin:/usr/games
      PIPESTATUS=([0]="0")
      PPID=21458
      PS4='+ '
      PWD=/home/SPALDING/jenkinsbuildserver
      SHELL=/bin/bash
      SHELLOPTS=braceexpand:hashall:interactive-comments
      SHLVL=1
      SSH_CLIENT='10.10.1.179 57938 22'
      SSH_CONNECTION='10.10.1.179 57938 10.10.0.251 22'
      TERM=dumb
      UID=10169
      USER=jenkinsbuildserver
      _=']'
      [07/24/17 10:42:11] [SSH] Checking java version of java
      [07/24/17 10:42:11] [SSH] java -version returned 1.8.0_131.
      [07/24/17 10:42:11] [SSH] Starting sftp client.
      [07/24/17 10:42:11] [SSH] Copying latest slave.jar...
      [07/24/17 10:42:11] [SSH] Copied 730,299 bytes.
      Expanded the channel window size to 4MB
      [07/24/17 10:42:11] [SSH] Starting slave process: cd "/home/SPALDING/jenkinsbuildserver/jenkins-agent" && java  -jar slave.jar -workDir /home/SPALDING/jenkinsbuildserver/jenkins-agent/work -failIfWorkDirIsMissing
      Jul 24, 2017 10:42:12 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
      INFO: Using /home/SPALDING/jenkinsbuildserver/jenkins-agent/work/remoting as a remoting work directory
      Both error and output logs will be printed to /home/SPALDING/jenkinsbuildserver/jenkins-agent/work/remoting
      <===[JENKINS REMOTING CAPACITY]===>ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins.
      java.lang.RuntimeException: Root directory not writable: /usr/share/tomcat8/.jenkins/cache/jars
              at hudson.remoting.FileSystemJarCache.<init>(FileSystemJarCache.java:57)
              at hudson.remoting.JarCache.getDefault(JarCache.java:32)
              at hudson.remoting.Channel.<init>(Channel.java:505)
              at hudson.remoting.ChannelBuilder.build(ChannelBuilder.java:323)
              at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:389)
              at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:1070)
              at hudson.plugins.sshslaves.SSHLauncher.access$500(SSHLauncher.java:144)
              at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:817)
              at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:792)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:748)
      [07/24/17 10:42:12] Launch failed - cleaning up connection
      [07/24/17 10:42:12] [SSH] Connection closed.
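
The stack trace above passes through hudson.slaves.SlaveComputer.setChannel and SSHLauncher, which run on the master, so the writability check appears to be evaluated in the master's JVM against the master user's home directory. That would explain why creating the directory on the slave has no effect. The following is a minimal sketch of such a check, not the Remoting source: the class and method names are hypothetical, and the only details taken from this report are the `.jenkins/cache/jars` layout and the error message text.

```java
import java.io.File;

/** Hypothetical sketch of a default jar cache check; not the actual Remoting code. */
final class JarCacheLocationSketch {

    /** Resolves <home>/.jenkins/cache/jars, the layout seen in the error message. */
    static File defaultJarCacheDir(String userHome) {
        return new File(userHome, ".jenkins/cache/jars");
    }

    /** Creates the directory if it is missing and fails when it cannot be written to. */
    static File requireWritable(File dir) {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new RuntimeException("Root directory not writable: " + dir);
        }
        if (!dir.canWrite()) {
            throw new RuntimeException("Root directory not writable: " + dir);
        }
        return dir;
    }

    public static void main(String[] args) {
        // user.home belongs to whichever JVM executes this code. Run as the
        // master's user, it reports the master's home, not the slave's.
        // Note: this creates the directory under user.home if it is missing.
        File dir = defaultJarCacheDir(System.getProperty("user.home"));
        System.out.println("Default jar cache would be: " + dir);
        try {
            requireWritable(dir);
            System.out.println("Directory is writable");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running this as the user that owns the master process (the Tomcat user in this setup) should show whether that user can create and write to the default location.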
      
      
      

      Reproducer using Docker

      Launch Jenkins using the following:

      mkdir jenkins_home
      sudo chown 2000 jenkins_home
      sudo chmod 777 jenkins_home
      docker run -ti -v $(pwd)/jenkins_home:/var/jenkins_home  -u 2000 --rm -p 8080:8080 -p 50000:50000 jenkinsci/jenkins:2.73
      

      Create a JNLP agent, then try to connect. You will then get the following sequence:

      master

      Sep 12, 2017 1:06:16 PM hudson.TcpSlaveAgentListener$ConnectionHandler run
      INFO: Accepted JNLP4-connect connection #1 from /172.17.0.1:38552
      Sep 12, 2017 1:06:16 PM org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer onRecv
      WARNING: [JNLP4-connect connection from 172.17.0.1/172.17.0.1:38552]
      java.lang.RuntimeException: Root directory not writable: ?/.jenkins/cache/jars
      	at hudson.remoting.FileSystemJarCache.<init>(FileSystemJarCache.java:57)
      	at hudson.remoting.JarCache.getDefault(JarCache.java:32)
      	at hudson.remoting.Channel.<init>(Channel.java:505)
      	at hudson.remoting.ChannelBuilder.build(ChannelBuilder.java:339)
      	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onRead(ChannelApplicationLayer.java:149)
      	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecv(ApplicationLayer.java:207)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:369)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
      	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
      	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
      	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
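
The `?` in the path above is probably the JVM's fallback value for `user.home` when the process UID (2000 here, via `-u 2000`) has no passwd entry inside the container. This is an assumption based on general JVM behaviour on Linux, not something stated in this report. Under that assumption the cache path is relative and resolves against the working directory of the Jenkins process. The small probe below, with hypothetical class and method names, shows how such a value would resolve:

```java
import java.io.File;

/** Hypothetical probe showing how a user.home value resolves to a directory. */
final class UserHomeProbe {

    /** Describes whether the given home is absolute and where it would resolve. */
    static String describe(String userHome, String workingDir) {
        File home = new File(userHome);
        // A relative home such as "?" is resolved against the working directory.
        File resolved = home.isAbsolute() ? home : new File(workingDir, userHome);
        return "user.home=" + userHome
                + " absolute=" + home.isAbsolute()
                + " resolves to " + resolved.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe(
                System.getProperty("user.home"),
                System.getProperty("user.dir")));
    }
}
```

Running the probe inside the container as the same UID should confirm or rule out this explanation for the reproducer.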
      

      agent

      INFOS: Locating server among [http://localhost:8080/]
      sept. 12, 2017 1:06:16 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFOS: Remoting server accepts the following protocols: [JNLP4-connect, JNLP-connect, Ping, JNLP2-connect]
      sept. 12, 2017 1:06:16 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Agent discovery successful
        Agent address: localhost
        Agent port:    50000
        Identity:      80:e1:c5:f6:d5:96:cf:1d:6a:58:45:48:2b:fe:67:76
      sept. 12, 2017 1:06:16 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Handshaking
      sept. 12, 2017 1:06:16 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Connecting to localhost:50000
      sept. 12, 2017 1:06:16 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Trying protocol: JNLP4-connect
      sept. 12, 2017 1:06:16 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Remote identity confirmed: 80:e1:c5:f6:d5:96:cf:1d:6a:58:45:48:2b:fe:67:76
      sept. 12, 2017 1:06:16 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Connected
      sept. 12, 2017 1:06:16 PM hudson.remoting.jnlp.Main$CuiListener status
      INFOS: Terminated
      

            oleg_nenashev Oleg Nenashev
            jmccormick Jesse McCormick