Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Defect
Ubuntu x64 as master
Windows 10 x64 as slave
Description
The jar file on the slave fails to start.
I use "Launch Slave Agents via SSH"
[04/06/18 15:32:05] [SSH] Opening SSH connection to 192.168.120.187:22.
[04/06/18 15:32:06] [SSH] SSH host key matches key seen previously for this host. Connection will be allowed.
[04/06/18 15:32:06] [SSH] Authentication successful.
[04/06/18 15:32:06] [SSH] The remote user's environment is:
[04/06/18 15:32:06] [SSH] Checking java version of java
[04/06/18 15:32:06] [SSH] java -version returned 1.8.0_161.
[04/06/18 15:32:06] [SSH] Starting sftp client.
[04/06/18 15:32:06] [SSH] Copying latest slave.jar...
[04/06/18 15:32:06] [SSH] Copied 762,466 bytes.
Expanded the channel window size to 4MB
[04/06/18 15:32:06] [SSH] Starting slave process: cd "D:\CI\jenkins" && java -jar slave.jar
<===[JENKINS REMOTING CAPACITY]===>Slave JVM has terminated. Exit code=0
[04/06/18 15:32:06] Launch failed - cleaning up connection
[04/06/18 15:32:06] [SSH] Connection closed.
I use the "Manually trusted key Verification Strategy".
"Require manual verification of initial connection" is OFF.
On Windows, I use Cygwin as bash and OpenSSH from Microsoft.
Java version master:
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
Java version slave:
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) Client VM (build 25.161-b12, mixed mode, sharing)
The jenkins version is: Jenkins ver. 2.107.2
Issue Links
is related to JENKINS-42856 Unable to launch Windows slaves using Microsoft OpenSSH: Unexpected termination of the channel (Fixed but Unreleased)
Activity
zack The output says that slave.jar successfully finished. In order to diagnose the issue, we need its logs.
There are several ways to get these logs, see https://speakerdeck.com/onenashev/day-of-jenkins-2017-dealing-with-agent-connectivity-issues?slide=51
According to the available evidence, it seems to be something related to Remoting or SSH Slaves plugin
I tried adding each of the following as "Suffix Start Slave Command":
 -slaveLog "D:/CI/jenkins-ssh/log.txt"
 -slaveLog="D:/CI/jenkins-ssh/log.txt"
 -logFile "D:/CI/jenkins-ssh/log.txt"
 -logFile="D:/CI/jenkins-ssh/log.txt"
 > "D:/CI/jenkins-ssh/log.txt" 2>"D:/CI/jenkins-ssh/Errorlog.txt"
However, no log was created.
Note the whitespace at the start.
The options were passed on, though; I get the following output from the agent:
[04/12/18 12:03:02] [SSH] Starting slave process: cd "D:/CI/jenkins-ssh" && java -jar slave.jar -slaveLog="D:/CI/jenkins-ssh/log.txt"
In the sshd.log file of the SSH server on Windows (where I want to run the slave) I get the following output:
5436 2018-04-12 11:43:08.739 WARNING: could not open __PROGRAMDATA__\\ssh/moduli (No such file or directory), using fixed modulus
5436 2018-04-12 11:43:08.989 Accepted password for BuildUser from 192.168.115.188 port 59704 ssh2
7648 2018-04-12 11:43:09.713 Received disconnect from 192.168.115.188 port 59704:11: Closed due to user request.
7648 2018-04-12 11:43:09.713 Disconnected from 192.168.115.188 port 59704
192.168.115.188 => jenkins master
I am not sure that Java correctly processes the "D:/CI/jenkins-ssh/log.txt" format on Windows. Since the machine is Windows, please try with backslashes.
Not all SSH servers are supported by the SSH Slaves plugin AFAIK. I know that OpenSSH works in some cases, but I have never tested it myself.
oleg_nenashev
Normally, bash-based scripts only allow forward slashes. However, I also tried backslashes, both with and without quotes.
None of them worked.
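For readers following the path discussion: under Cygwin the same directory is normally addressed as /cygdrive/d/CI/jenkins-ssh in the shell, while a native Windows program generally expects the D:\ form. A minimal sketch of how the two forms relate is shown below. The helper name to_cygdrive is hypothetical and only handles simple drive-letter paths; on a real Cygwin installation the cygpath tool (cygpath -u) is the usual way to do this.

```shell
# Hypothetical helper (not part of Jenkins or Cygwin): convert a simple
# Windows drive path such as D:\CI\jenkins-ssh to the /cygdrive form.
to_cygdrive() {
  # First character is the drive letter; lower-case it.
  drive=$(printf '%s' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]')
  # Everything after "X:" is the rest of the path; turn backslashes into slashes.
  rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
  printf '/cygdrive/%s%s\n' "$drive" "$rest"
}

to_cygdrive 'D:\CI\jenkins-ssh'
to_cygdrive 'D:/CI/jenkins-ssh/log.txt'
```

Whether a given option wants the Windows form or the Cygwin form depends on which program reads it (the Cygwin shell or the Java process), so this is only an aid for comparing the two spellings.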
Since it's confirmed that JNLP agents manage to connect, I assume this is an issue in SSH Slaves plugin. Reassigned it to the plugin's maintainer
I saw that you have installed a 32-bit JDK on a 64-bit host. Could you install a 64-bit JDK on the agent?
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) Client VM (build 25.161-b12, mixed mode, sharing)
I updated the Java version on the Windows agent:
C:\Users\BuildUser>java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
However, the error is still there.
[04/17/18 14:39:54] [SSH] Opening SSH connection to 192.168.120.187:22.
[04/17/18 14:39:54] [SSH] SSH host key matches key seen previously for this host. Connection will be allowed.
[04/17/18 14:39:54] [SSH] Authentication successful.
[04/17/18 14:39:54] [SSH] The remote user's environment is:
[04/17/18 14:39:54] [SSH] Checking java version of java
[04/17/18 14:39:54] [SSH] java -version returned 1.8.0_161.
[04/17/18 14:39:54] [SSH] Starting sftp client.
[04/17/18 14:39:54] [SSH] Copying latest slave.jar...
[04/17/18 14:39:55] [SSH] Copied 762,466 bytes.
Expanded the channel window size to 4MB
[04/17/18 14:39:55] [SSH] Starting slave process: cd "D:\CI\jenkins-ssh" && java -jar slave.jar -slaveLog=D:\CI\jenkins-ssh\log.txt
<===[JENKINS REMOTING CAPACITY]===>Slave JVM has terminated. Exit code=0
[04/17/18 14:39:55] Launch failed - cleaning up connection
[04/17/18 14:39:55] [SSH] Connection closed.
Is there a support folder in D:\CI\jenkins-ssh?
Are there log files in D:\CI\jenkins-ssh?
Are there hs_err_* files in D:\CI\jenkins-ssh?
Is java on the PATH of the user that you log in with over SSH?
Could you attach the config.xml of the agent? JENKINS_URL/computer/AGENT_NAME/config.xml
How much memory does this agent have?
Could you attach the D:/CI/jenkins-ssh/log.txt file?
Execute this command to see the version of sshd you use:
sshd --foo
$ sshd --foo
sshd: unknown option -- -
OpenSSH_7.3p1, OpenSSL 1.0.2j 26 Sep 2016
usage: sshd [-46DdeiqTt] [-b bits] [-C connection_spec] [-c host_cert_file] [-E log_file] [-f config_file] [-g login_grace_time] [-h host_key_file] [-k key_gen_time] [-o option] [-p port] [-u len]
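The file and PATH questions asked above (support folder, log files, hs_err_* files, java on the PATH) can be collected in one pass. The sketch below is only an illustration: agent_dir_report is a hypothetical helper name, and the file patterns (*.log, log.txt, hs_err_*) are assumptions based on the names mentioned in this thread.

```shell
# Hypothetical helper: summarize what is present in an agent working directory.
agent_dir_report() {
  dir=$1
  if [ -d "$dir/support" ]; then
    echo "support folder: yes"
  else
    echo "support folder: no"
  fi
  # Log files directly inside the directory, excluding JVM crash dumps.
  logs=$(find "$dir" -maxdepth 1 -type f \( -name '*.log' -o -name 'log.txt' \) ! -name 'hs_err_*' | wc -l | tr -d ' ')
  # JVM crash dumps are written as hs_err_* files.
  crashes=$(find "$dir" -maxdepth 1 -type f -name 'hs_err_*' | wc -l | tr -d ' ')
  echo "log files: $logs"
  echo "hs_err files: $crashes"
  # Is a java executable visible to this (non-interactive) shell?
  if command -v java >/dev/null 2>&1; then
    echo "java on PATH: yes"
  else
    echo "java on PATH: no"
  fi
}

# Report on the current directory; pass the agent directory instead when using it.
agent_dir_report .
```

Running it through the same SSH login that Jenkins uses, rather than from a local console, is what makes the PATH line meaningful.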
There are no other files in the folder: D:\CI\jenkins-ssh, except the "slave.jar"
When I try to download: http://jenkins/computer/windows-ssh/config.xml
I get the attached XML (config.zip)
When using Cygwin as bash, I also get the environment variables printed:
[04/17/18 17:58:59] [SSH] Opening SSH connection to windows-build-server:22.
[04/17/18 17:58:59] [SSH] SSH host key matches key seen previously for this host. Connection will be allowed.
[04/17/18 17:58:59] [SSH] Authentication successful.
[04/17/18 17:58:59] [SSH] The remote user's environment is:
ALLUSERSPROFILE='C:\ProgramData'
APPDATA='C:\Users\BuildUser\AppData\Roaming'
BASH=/usr/bin/bash
BASHOPTS=cmdhist:complete_fullquote:extquote:force_fignore:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath
BASH_ALIASES=()
BASH_ARGC=()
BASH_ARGV=()
BASH_CMDS=()
BASH_EXECUTION_STRING=set
BASH_LINENO=()
BASH_SOURCE=()
BASH_VERSINFO=([0]="4" [1]="4" [2]="12" [3]="3" [4]="release" [5]="x86_64-unknown-cygwin")
BASH_VERSION='4.4.12(3)-release'
COMMONPROGRAMFILES='C:\Program Files\Common Files'
COMPUTERNAME=W10X64-BUILDSRV
COMSPEC='C:\WINDOWS\system32\cmd.exe'
CYG_SYS_BASHRC=1
CommonProgramW6432='C:\Program Files\Common Files'
DIRSTACK=()
DXSDK_DIR='C:\Program Files (x86)\Microsoft DirectX SDK (June 2010)\'
EUID=197611
GROUPS=()
HOME=/home/BuildUser
HOMEDRIVE=C:
HOMEPATH='\Users\BuildUser'
HOSTNAME=W10x64-BuildSrv
HOSTTYPE=x86_64
IFS=$' \t\n'
LOCALAPPDATA='C:\Users\BuildUser\AppData\Local'
MACHTYPE=x86_64-unknown-cygwin
NUMBER_OF_PROCESSORS=4
OPTERR=1
OPTIND=1
OS=Windows_NT
OSTYPE=cygwin
OneDrive='C:\Users\BuildUser\OneDrive'
PATH='/cygdrive/c/ProgramData/Oracle/Java/javapath:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/WINDOWS/System32/WindowsPowerShell/v1.0:/cygdrive/c/Program Files (x86)/Windows Kits/8.1/Windows Performance Toolkit:/cygdrive/c/Program Files/Microsoft SQL Server/110/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SDKs/TypeScript/1.0:/cygdrive/c/Program Files/Microsoft SQL Server/120/Tools/Binn:/cygdrive/c/Program Files/Git/cmd:/cygdrive/c/Program Files/SafeNet/Authentication/SAC/x64:/cygdrive/c/Program Files/SafeNet/Authentication/SAC/x32:/cygdrive/c/WINDOWS/SysWOW64/WindowsPowerShell/v1.0/Modules/TShell/TShell:/cygdrive/c/WINDOWS/system32/config/systemprofile/AppData/Local/Microsoft/WindowsApps:/cygdrive/c/Users/BuildUser/AppData/Local/Microsoft/WindowsApps'
PATHEXT='.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC'
PIPESTATUS=([0]="0")
PPID=1
PROCESSOR_ARCHITECTURE=AMD64
PROCESSOR_IDENTIFIER='Intel64 Family 6 Model 60 Stepping 3, GenuineIntel'
PROCESSOR_LEVEL=6
PROCESSOR_REVISION=3c03
PROGRAMFILES='C:\Program Files'
PROMPT='builduser@W10X64-BUILDSRV $P$G'
PS4='+ '
PSModulePath='C:\Program Files\WindowsPowerShell\Modules;C:\WINDOWS\system32\WindowsPowerShell\v1.0\Modules'
PUBLIC='C:\Users\Public'
PWD=/cygdrive/c/Users/BuildUser
ProgramData='C:\ProgramData'
ProgramW6432='C:\Program Files'
SHELL=/bin/bash
SHELLOPTS=braceexpand:hashall:interactive-comments
SHLVL=1
SSH_CLIENT='192.168.115.188 59638 22'
SSH_CONNECTION='192.168.115.188 59638 192.168.120.187 22'
SYSTEMDRIVE=C:
SYSTEMROOT='C:\WINDOWS'
TEMP=/cygdrive/c/Users/BuildUser/AppData/Local/Temp
TERM=cygwin
TMP=/cygdrive/c/Users/BuildUser/AppData/Local/Temp
UID=197611
USERDOMAIN=WORKGROUP
USERNAME=builduser
USERPROFILE='C:\Users\BuildUser'
VS110COMNTOOLS='C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\Tools\'
VS120COMNTOOLS='C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\Tools\'
WINDIR='C:\WINDOWS'
_=
[04/17/18 17:59:00] [SSH] Checking java version of java
[04/17/18 17:59:00] [SSH] java -version returned 1.8.0_161.
[04/17/18 17:59:00] [SSH] Starting sftp client.
[04/17/18 17:59:00] [SSH] Copying latest slave.jar...
[04/17/18 17:59:00] [SSH] Copied 762,466 bytes.
Expanded the channel window size to 4MB
[04/17/18 17:59:00] [SSH] Starting slave process: cd "D:\CI\jenkins-ssh" && java -jar slave.jar -slaveLog=D:\CI\jenkins-ssh\log.txt
<===[JENKINS REMOTING CAPACITY]===>Slave JVM has terminated. Exit code=0
[04/17/18 17:59:00] Launch failed - cleaning up connection
[04/17/18 17:59:00] [SSH] Connection closed.
Create a file named D:\CI\jenkins-ssh\logging.properties and add this property to the JVM options: -Djava.util.logging.config.file=/cygdrive/d/CI/jenkins-ssh/logging.properties
Also set the root fs to /cygdrive/d/CI/jenkins-ssh/ in the agent configuration.
# -Djava.util.logging.config.file=logging.properties
.level = ALL
handlers= java.util.logging.FileHandler
java.util.logging.FileHandler.level = ALL
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
java.util.logging.FileHandler.pattern=jenkins-ssh-agent-%u.log
java.util.logging.FileHandler.limit = 10000000
java.util.logging.FileHandler.count = 10
javax.jms.connection.level = INFO
hudson.level = INFO
hudson.remoting.Channel.level = FINE
hudson.remoting.FileSystemJarCache.level = INFO
hudson.remoting.jnlp.level = FINE
hudson.remoting.RemoteClassLoader.level = INFO
jenkins.slaves.level = FINE
hudson.slaves.level = FINE
org.jenkinsci.remoting.engine.level = FINE
jenkins.AgentProtocol.level = FINE
The attached config file is not complete; it is a copy and paste of a parse error from your browser. Please download the file and attach it.
If nothing is generated in D:\CI\jenkins-ssh, try to execute the slave.jar process manually to see whether anything else shows up. Follow these steps:
- Log in to the agent over SSH with the same user you use in Jenkins
- Execute "D:\CI\jenkins-ssh" && java -jar slave.jar -slaveLog=D:\CI\jenkins-ssh\log.txt"
- Copy the output and paste it here
Here is my output of the console (logged in from the master agent, where jenkins is running):
BuildUser@W10x64-BuildSrv /cygdrive/c/Users/BuildUser
$ D:
bash: D:: command not found
BuildUser@W10x64-BuildSrv /cygdrive/c/Users/BuildUser
$ cd d:
BuildUser@W10x64-BuildSrv /cygdrive/d
$ cd CI/jenkins-ssh/
BuildUser@W10x64-BuildSrv /cygdrive/d/CI/jenkins-ssh
$ d:
bash: d:: command not found
BuildUser@W10x64-BuildSrv /cygdrive/d/CI/jenkins-ssh
$ cd ..
BuildUser@W10x64-BuildSrv /cygdrive/d/CI
$ cd ..
BuildUser@W10x64-BuildSrv /cygdrive/d
$ cd C:
BuildUser@W10x64-BuildSrv /cygdrive/c
$ cls
bash: cls: command not found
BuildUser@W10x64-BuildSrv /cygdrive/c
$ clear
bash: clear: command not found
BuildUser@W10x64-BuildSrv /cygdrive/c
$ cls
bash: cls: command not found
BuildUser@W10x64-BuildSrv /cygdrive/c
$ "D:\CI\jenkins-ssh" && java -jar slave.jar -slaveLog=D:\CI\jenkins-ssh\log.txt"
> pwd
> ls
>
BuildUser@W10x64-BuildSrv /cygdrive/c
$ d:
bash: d:: command not found
BuildUser@W10x64-BuildSrv /cygdrive/c
$ cd D:
BuildUser@W10x64-BuildSrv /cygdrive/d
$ cd CI/jenkins
BuildUser@W10x64-BuildSrv /cygdrive/d/CI/jenkins
$ java -jar slave.jar -slaveLog=D:\CI\jenkins-ssh\log.txt"
>
BuildUser@W10x64-BuildSrv /cygdrive/d/CI/jenkins
$ java -jar slave.jar
WARNING: Are you running agent from an interactive console? If so, you are probably using it incorrectly. See https://wiki.jenkins.io/display/JENKINS/Launching+agent+from+console
<===[JENKINS REMOTING CAPACITY]===>rO0ABXNyABpodWRzb24ucmVtb3RpbmcuQ2FwYWJpbGl0eQAAAAAAAAABAgABSgAEbWFza3hwAAAAAAAAAP4=
Only when I executed the jar directly (as you can see in the last command) could I start the agent; otherwise the command did not return.
The correct config.xml is now attached.
>Only when I executed the jar directly (as you can see in the last command) could I start the agent; otherwise the command did not return.
That is OK, it should not return. I wanted to check that you can execute java -jar slave.jar and that it does not return errors.
The attached config.xml belongs to a multibranch project; it is not the config.xml of the agent. Anyway, add this command as the "Suffix Start Slave Command":
|| echo "KO - retcode $?"
Then try it and attach the output.
Finally, replace the previous command by this one and attach the output
&& echo "OK - retcode $?"
I want to get the real exit code of the java process
How much memory does this agent have? Does it have an antivirus installed?
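The idea behind the two suffixes is that `&&` only runs its right-hand side when the command exits with 0, while `||` only runs it on a non-zero exit, so `$?` in the echo shows the status of the preceding command. A minimal sketch of the same idea in one wrapper follows; report_status is a hypothetical helper name used for illustration and is not something the SSH Slaves plugin provides.

```shell
# Hypothetical wrapper: run a command and always print its real exit status.
report_status() {
  "$@"
  rc=$?    # capture the status before any other command overwrites $?
  if [ "$rc" -eq 0 ]; then
    echo "OK - retcode $rc"
  else
    echo "KO - retcode $rc"
  fi
  return "$rc"
}

report_status true                      # prints: OK - retcode 0
report_status sh -c 'exit 3' || true    # prints: KO - retcode 3
```

With the two separate suffixes, seeing neither message in the launch log would suggest that the shell never got as far as evaluating the suffix at all.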
You mean like this?
[04/18/18 18:15:35] [SSH] Starting slave process: cd "/cygdrive/d/CI/jenkins-ssh" && java -Djava.util.logging.config.file=/cygdrive/d/CI/jenkins-ssh/logging.properties -jar slave.jar && echo "OK - retcode $?"
<===[JENKINS REMOTING CAPACITY]===>Slave JVM has terminated. Exit code=0
[04/18/18 18:15:35] Launch failed - cleaning up connection
[04/18/18 18:15:35] [SSH] Connection closed.
I executed the previous command before, nothing.
The agent has 16 GB of RAM
I can also successfully build on the machine, when I use the connection of "Launch agent via web start".
Do you not find it strange, that I could execute the slave.jar file only when I was in the folder?
Could you try the other one? It seems like the slave.jar process dies.
|| echo "KO - retcode $?"
I think I know what is happening. Clear the suffix and prefix fields and put exactly this in the prefix field:
/bin/bash -c "cd "/cygdrive/d/CI/jenkins-ssh" && java -Djava.util.logging.config.file=/cygdrive/d/CI/jenkins-ssh/logging.properties -jar slave.jar";
Here is the output:
prefix_output.txt
I just inserted text into the prefix field.
ifernandezcalvo
Any update? Or are you now at the end of your knowledge? A tricky problem.
/bin/bash -c "cd "/cygdrive/d/CI/jenkins-ssh" && java -Djava.util.logging.config.file=/cygdrive/d/CI/jenkins-ssh/logging.properties -jar slave.jar"
The command line is not correct, but that does not matter, because "/bin/bash -c" does not return any error; it should return at least a syntax error. I do not know what is happening in your environment, but it is not related to the SSH Slaves plugin. It is something in your sshd configuration, your user login, your environment initialization files (.bashrc, .profile, ...), or your default shell. I bet that if you put
/bin/bash -c ls &&
on your command prefix you would see nothing in the console output.
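As a side note on why the quoting in that prefix is fragile: in a POSIX shell, a double quote inside a double-quoted string ends the string rather than nesting, and adjacent pieces are joined into one word only while no unquoted whitespace separates them. The sketch below illustrates this with a hypothetical helper, show_args, and a hypothetical directory name containing a space. It only demonstrates local shell parsing; how the sshd on the agent hands the command to its shell is not covered and remains an assumption.

```shell
# Hypothetical helper: print each argument the shell produced, one per line.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Path without spaces: the quoted and unquoted pieces are adjacent,
# so they still join into ONE argument.
show_args "cd "/cygdrive/d/CI/jenkins-ssh" && java -jar slave.jar"

# Hypothetical path with a space: the unquoted space splits the word,
# so the same quoting style now yields TWO arguments.
show_args "cd "/cygdrive/d/CI/jenkins ssh" && java -jar slave.jar"

# Single quotes inside the double quotes keep everything in one argument.
show_args "cd '/cygdrive/d/CI/jenkins ssh' && java -jar slave.jar"
```

Since `bash -c` treats only the first argument after `-c` as the command string, the single-quoted inner form is the safer way to write such a prefix.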
You could try to execute a command using the ssh command line; it will probably fail:
ssh USERNAME@AGENT_HOST /bin/bash -c ls
Try the same with the -t parameter; it should work:
ssh -t USERNAME@AGENT_HOST /bin/bash -c ls
I recommend that you check a really good comment from Ben Langton at https://wiki.jenkins.io/display/JENKINS/SSH+slaves+and+Cygwin that explains all the steps needed to configure OpenSSH.
If you need more help, try the Jenkins users group.
dnusbaum
Any update here? I cannot use jenkins because of this.