Bug
Resolution: Fixed
Critical
Windows Server 2012 x64
jre1.8.0_241
After upgrading to Trilead API v1.0.11, my connections to SSH clients fail with the error below:
[09/27/20 10:23:16] [SSH] WARNING: SSH Host Keys are not being verified. Man-in-the-middle attacks may be possible against this connection.
Key exchange was not finished, connection is closed.
SSH Connection failed with IOException: "Key exchange was not finished, connection is closed.", retrying in 5 seconds. There are 1 more retries left.
[09/27/20 10:23:22] [SSH] WARNING: SSH Host Keys are not being verified. Man-in-the-middle attacks may be possible against this connection.
Key exchange was not finished, connection is closed.
ERROR: Connection is not established!
I have reproduced this on two environments and get exactly the same results. Downgrading to v1.0.10 fixes the issue.
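To check several agent launch logs for this failure signature, a minimal Python sketch follows. It relies only on the message strings quoted in the report above; the function name `summarize_agent_log` is hypothetical and not part of Jenkins or any plugin.

```python
import re

# Failure signature copied from the log lines quoted in this report.
KEX_FAILURE = "Key exchange was not finished, connection is closed."
RETRY_RE = re.compile(r"There are (\d+) more retries left")


def summarize_agent_log(log_text):
    """Return (kex_failures, retries_left, gave_up) for one agent launch log.

    Hypothetical helper for triage; it only matches literal strings.
    """
    lines = log_text.splitlines()
    # Count only the bare failure lines, not the IOException line that quotes them.
    kex_failures = sum(1 for line in lines if line.strip() == KEX_FAILURE)
    retries = [int(m.group(1)) for m in RETRY_RE.finditer(log_text)]
    gave_up = any("Connection is not established!" in line for line in lines)
    return kex_failures, (retries[-1] if retries else None), gave_up
```

A log with at least one key exchange failure followed by "Connection is not established!" matches the behaviour described in this ticket.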
- duplicates
  JENKINS-63794 SSH agent - Key exchange was not finished, connection is closed (Closed)
- is duplicated by
  JENKINS-63829 Git SSH connection fails in TriLead KexManager (Closed)
[JENKINS-63790] Trilead API v1.0.11 causes SSH agent connections to fail
Jenkins 2.258
SSH Build Agents plugin 1.31.2
I don't think I am using a key, I set "Non verifying Verification Strategy".
Do you know the Operating system of your agent and the OpenSSH version installed?
Yes user name and password.
I'm not sure of the OpenSSH version; do you know how I can check it? The agents are running on a mixture of OS's and they all have the same issue, Ubuntu 18 and 20 and Windows Server 2019 (amd64).
Run the following command on the agent:
❯ ssh -V
OpenSSH_8.1p1, LibreSSL 2.7.3
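When collecting versions from many agents, the banner printed by `ssh -V` can be parsed into a comparable tuple. This is a sketch under the assumption that the banner looks like the ones quoted in this thread (`OpenSSH_8.1p1`, `OpenSSH_6.6.1p1`, `OpenSSH_for_Windows_7.7p1`); `parse_openssh_version` is a hypothetical helper name.

```python
import re

# Matches banners such as "OpenSSH_8.1p1", "OpenSSH_6.6.1p1" and
# "OpenSSH_for_Windows_7.7p1" as they appear in this thread.
VERSION_RE = re.compile(
    r"OpenSSH(?:_for_Windows)?_(\d+)\.(\d+)(?:\.(\d+))?(?:p(\d+))?"
)


def parse_openssh_version(banner):
    """Return (major, minor, patch, portable) or None if no version is found."""
    match = VERSION_RE.search(banner)
    if match is None:
        return None
    major, minor, patch, portable = match.groups()
    # Missing patch or portable components are reported as 0.
    return (int(major), int(minor), int(patch or 0), int(portable or 0))
```

Note that `ssh -V` writes its banner to standard error, so a script that runs the command needs to capture stderr rather than stdout.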
>The agents are running on a mixture of OS's and they all have the same issue, Ubuntu 18 and 20 and Windows Server 2019 (amd64).
Are all failing after the update?
I've added the user+password scenario to my test environment (https://github.com/kuisathaverat/jenkins-issues/tree/master/JENKINS-63790). I cannot replicate the issue on:
Jenkins
Debian GNU/Linux 9
OpenSSH_7.4p1 Debian-10+deb9u7, OpenSSL 1.0.2u 20 Dec 2019
openjdk version "1.8.0_242"
OpenJDK Runtime Environment (build 1.8.0_242-b08)
OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)
Agents
Debian GNU/Linux 10
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1d 10 Sep 2019
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment 18.9 (build 11.0.7+10)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.7+10, mixed mode)
They all fail to start, and from two different Windows Jenkins servers (talking to different agents):
OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.8, OpenSSL 1.0.1f 6 Jan 2014
OpenSSH_5.9p1 Debian-5ubuntu1.1, OpenSSL 1.0.1 14 Mar 2012
OpenSSH_7.2p2 Ubuntu-4ubuntu2.4, OpenSSL 1.0.2g 1 Mar 2016
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.5
java version "1.8.0_231"
Java(TM) SE Runtime Environment (build 1.8.0_231-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.231-b11, mixed mode)
Jenkins host
OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.5
java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) Client VM (build 25.261-b12, mixed mode, sharing)
Let me know if you need more info.
The same thing happened in June with trilead-api-1.0.7, and trilead-api-1.0.8 fixed it.
That was a different issue, and it is resolved. I reverted the change in 1.0.8 and included the fix in 1.0.9, but that release had another bug related to password-protected keys, so we reverted the changes again. As a result, 1.0.8 and 1.0.10 are equivalent: both use trilead-ssh2:build-217-jenkins-21.
I've added Ubuntu 14.04, Ubuntu 16.04, Ubuntu 18.04, and Ubuntu 20.04 to my test environment, and I can connect to all of them without issues.
Ubuntu 14.04.6 LTS OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.13, OpenSSL 1.0.1f 6 Jan 2014 openjdk version "1.8.0_222" OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1~14.04-b10) OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
Ubuntu 16.04.7 LTS OpenSSH_7.2p2 Ubuntu-4ubuntu2.10, OpenSSL 1.0.2g 1 Mar 2016 openjdk version "1.8.0_265" OpenJDK Runtime Environment (build 1.8.0_265-8u265-b01-0ubuntu2~16.04-b01) OpenJDK 64-Bit Server VM (build 25.265-b01, mixed mode)
Ubuntu 18.04.5 LTS OpenSSH_7.6p1 Ubuntu-4ubuntu0.3, OpenSSL 1.0.2n 7 Dec 2017 openjdk version "1.8.0_265" OpenJDK Runtime Environment (build 1.8.0_265-8u265-b01-0ubuntu2~18.04-b01) OpenJDK 64-Bit Server VM (build 25.265-b01, mixed mode)
Ubuntu 20.04.1 LTS OpenSSH_8.2p1 Ubuntu-4ubuntu0.1, OpenSSL 1.1.1f 31 Mar 2020 openjdk version "1.8.0_265" OpenJDK Runtime Environment (build 1.8.0_265-8u265-b01-0ubuntu2~20.04-b01) OpenJDK 64-Bit Server VM (build 25.265-b01, mixed mode)
At this point I think it is not related to the environment: I have tested old OpenSSH versions, old Java versions, and old operating systems.
Could you share the config of one of those agents (http://jenkins.example.com:8080/computer/MY_AGENT_NAME/config.xml)? It should return something like this. Please replace any sensitive info with placeholders.
<slave>
  <name>PRIVATE_KEY_RSA_AES_128_CBC</name>
  <description>SSH agent</description>
  <remoteFS>/home/jenkins</remoteFS>
  <numExecutors>1</numExecutors>
  <mode>NORMAL</mode>
  <retentionStrategy class="hudson.slaves.RetentionStrategy$Always"/>
  <launcher class="hudson.plugins.sshslaves.SSHLauncher" plugin="ssh-slaves@1.31.2">
    <host>ssh-agent-rsa</host>
    <port>22</port>
    <credentialsId>PRIVATE_KEY_RSA_AES_128_CBC</credentialsId>
    <javaPath>/usr/local/openjdk-11/bin/java</javaPath>
    <launchTimeoutSeconds>210</launchTimeoutSeconds>
    <maxNumRetries>10</maxNumRetries>
    <retryWaitTime>15</retryWaitTime>
    <sshHostKeyVerificationStrategy class="hudson.plugins.sshslaves.verifiers.NonVerifyingKeyVerificationStrategy"/>
  </launcher>
  <label>ssh-agent</label>
  <nodeProperties/>
</slave>
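For anyone sharing an agent config, the placeholder step can be scripted. The sketch below uses Python's standard `xml.etree.ElementTree`; the list of elements treated as sensitive is an assumption made for illustration, and `redact_agent_config` is a hypothetical helper, not part of Jenkins.

```python
import xml.etree.ElementTree as ET

# Assumption: these elements may reveal hosts, paths or credential IDs.
# Adjust the list to whatever is sensitive in your environment.
SENSITIVE_TAGS = ("name", "description", "remoteFS", "host", "credentialsId")


def redact_agent_config(xml_text, placeholder="XXX"):
    """Return the agent config.xml text with sensitive element text replaced."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Only element text is replaced; tags and attributes such as the
        # launcher class and verification strategy are left untouched.
        if element.tag in SENSITIVE_TAGS and element.text:
            element.text = placeholder
    return ET.tostring(root, encoding="unicode")
```

The launcher class, port, retry settings and host key verification strategy stay readable, which is the information needed to compare configurations.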
This is a Windows one
<slave>
  <name>PAXBUILD01</name>
  <description/>
  <remoteFS>D:\</remoteFS>
  <numExecutors>1</numExecutors>
  <mode>NORMAL</mode>
  <retentionStrategy class="hudson.slaves.RetentionStrategy$Always"/>
  <launcher class="hudson.plugins.sshslaves.SSHLauncher" plugin="ssh-slaves@1.31.2">
    <host>PAXBUILD01</host>
    <port>22</port>
    <credentialsId>b83688a1-3972-46cb-9eff-43741f4c6099</credentialsId>
    <javaPath>"C:\Program Files\Java\jre1.8.0_231\bin\java.exe"</javaPath>
    <prefixStartSlaveCmd>D: && </prefixStartSlaveCmd>
    <launchTimeoutSeconds>60</launchTimeoutSeconds>
    <maxNumRetries>10</maxNumRetries>
    <retryWaitTime>15</retryWaitTime>
    <sshHostKeyVerificationStrategy class="hudson.plugins.sshslaves.verifiers.NonVerifyingKeyVerificationStrategy"/>
    <tcpNoDelay>true</tcpNoDelay>
  </launcher>
  <label>paxbuild paxton10-test winbuild net2entry</label>
  <nodeProperties>
    <jp.ikedam.jenkins.plugins.scoringloadbalancer.preferences.BuildPreferenceNodeProperty plugin="scoring-load-balancer@1.0.1">
      <preference>0</preference>
    </jp.ikedam.jenkins.plugins.scoringloadbalancer.preferences.BuildPreferenceNodeProperty>
  </nodeProperties>
</slave>
This is Linux
<slave>
  <name>PAXPHBLD-UB1 (Yocto)</name>
  <description>Ubuntu Build machine for Yocto AreaController</description>
  <remoteFS>/home/standby/jenkins</remoteFS>
  <numExecutors>1</numExecutors>
  <mode>EXCLUSIVE</mode>
  <retentionStrategy class="hudson.slaves.RetentionStrategy$Always"/>
  <launcher class="hudson.plugins.sshslaves.SSHLauncher" plugin="ssh-slaves@1.31.2">
    <host>PAXPHBLD-UB1</host>
    <port>22</port>
    <credentialsId>bb278837-5d2e-407b-8c33-e69cb7812818</credentialsId>
    <launchTimeoutSeconds>60</launchTimeoutSeconds>
    <maxNumRetries>0</maxNumRetries>
    <retryWaitTime>0</retryWaitTime>
    <sshHostKeyVerificationStrategy class="hudson.plugins.sshslaves.verifiers.NonVerifyingKeyVerificationStrategy"/>
    <tcpNoDelay>true</tcpNoDelay>
  </launcher>
  <label>yocto</label>
  <nodeProperties/>
</slave>
Master and slave: CentOS 7, OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
Slave's configuration:
<slave>
  <name>Jenkins</name>
  <description>XXX</description>
  <remoteFS>XXX</remoteFS>
  <numExecutors>5</numExecutors>
  <mode>NORMAL</mode>
  <retentionStrategy class="hudson.slaves.RetentionStrategy$Always"/>
  <launcher class="hudson.plugins.sshslaves.SSHLauncher" plugin="ssh-slaves@1.31.2">
    <host>XXX</host>
    <port>22</port>
    <credentialsId>XXX</credentialsId>
    <launchTimeoutSeconds>60</launchTimeoutSeconds>
    <maxNumRetries>10</maxNumRetries>
    <retryWaitTime>15</retryWaitTime>
    <sshHostKeyVerificationStrategy class="hudson.plugins.sshslaves.verifiers.KnownHostsFileKeyVerificationStrategy"/>
    <tcpNoDelay>true</tcpNoDelay>
  </launcher>
  <label>linux</label>
  <nodeProperties>
    <hudson.tools.ToolLocationNodeProperty>
      <locations>
        <hudson.tools.ToolLocationNodeProperty_-ToolLocation>
          <type>hudson.plugins.git.GitTool$DescriptorImpl</type>
          <name>GitGlobalTool</name>
          <home>XXXX</home>
        </hudson.tools.ToolLocationNodeProperty_-ToolLocation>
      </locations>
    </hudson.tools.ToolLocationNodeProperty>
  </nodeProperties>
</slave>
My agents are baremetal:
<slave>
  <name>####</name>
  <description>####</description>
  <remoteFS>####</remoteFS>
  <numExecutors>2</numExecutors>
  <mode>NORMAL</mode>
  <retentionStrategy class="hudson.slaves.RetentionStrategy$Always"/>
  <launcher class="hudson.plugins.sshslaves.SSHLauncher" plugin="ssh-slaves@1.31.2">
    <host>####</host>
    <port>###</port>
    <credentialsId>####</credentialsId>
    <launchTimeoutSeconds>60</launchTimeoutSeconds>
    <maxNumRetries>10</maxNumRetries>
    <retryWaitTime>15</retryWaitTime>
    <sshHostKeyVerificationStrategy class="hudson.plugins.sshslaves.verifiers.KnownHostsFileKeyVerificationStrategy"/>
    <tcpNoDelay>true</tcpNoDelay>
  </launcher>
  <label>CentOS7</label>
  <nodeProperties/>
</slave>
I am also running into this issue this morning after updating everything yesterday.
Jenkins version: 2.249.1
ssh build agents plugin: 1.31.2
Jenkins Master is Windows Server 2016 all slaves are ubuntu 18.04. All slaves are running as VMs in Hyper-V and failing to connect.
ssh versions:
node1: OpenSSH_7.6p1 Ubuntu-4ubuntu0.3, OpenSSL 1.0.2n 7 Dec 2017
node2: OpenSSH_7.2p2 Ubuntu-4ubuntu2.8, OpenSSL 1.0.2g 1 Mar 2016
node3: OpenSSH_7.6p1 Ubuntu-4ubuntu0.3, OpenSSL 1.0.2n 7 Dec 2017
node4: OpenSSH_7.6p1 Ubuntu-4ubuntu0.3, OpenSSL 1.0.2n 7 Dec 2017
All are running the same version of Java:
openjdk version "1.8.0_265"
OpenJDK Runtime Environment (build 1.8.0_265-8u265-b01-0ubuntu2~18.04-b01)
OpenJDK 64-Bit Server VM (build 25.265-b01, mixed mode)
I'm starting to think this is related to a change in how timeouts are managed in the trilead-ssh2 library. The change fixes an issue by avoiding an infinite wait, but I think it breaks something. I will try to replicate it with cloud VMs and low timeouts; if I can replicate it, we have a winner.
I'd like to "me too" this ticket! My careless clicking "upgrade" was going fine until ssh-slaves pulled in and we ran into this.
My master is an old linux box with 1.8 and the build agents are newer linux with java 1.8
I followed the advice here and downloaded 1.0.10 from https://updates.jenkins.io/download/plugins/trilead-api/ into my plugins/ directory (rename the hpi to jpi), then restarted the Jenkins master to get connected again.
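The manual downgrade described above (place the downloaded file in plugins/ under the .jpi name, then restart) can be sketched in Python as follows. The helper name `install_plugin_file` and the optional `backup_dir` parameter are hypothetical; the sketch assumes the .hpi file has already been downloaded and that Jenkins is restarted afterwards.

```python
import shutil
from pathlib import Path


def install_plugin_file(downloaded_hpi, plugins_dir,
                        plugin_name="trilead-api", backup_dir=None):
    """Copy a downloaded .hpi into the plugins directory as <plugin_name>.jpi.

    If backup_dir is given and a .jpi is already installed, the installed
    file is copied there first so the step can be undone by hand.
    """
    plugins_dir = Path(plugins_dir)
    target = plugins_dir / f"{plugin_name}.jpi"
    if backup_dir is not None and target.exists():
        backup_dir = Path(backup_dir)
        backup_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(target, backup_dir / target.name)
    # Overwrite the installed plugin with the downloaded version.
    shutil.copy2(downloaded_hpi, target)
    return target
```

The backup is kept outside the plugins directory on purpose, so that only one copy of the plugin is present there when Jenkins starts.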
I've added a note to the release notes, to warn people that the update can cause this issue on some systems.
I've experienced this too. Again master is on Windows and most nodes are Linux VMs in Azure. Some nodes are also Windows (SSH), others AIX and IBM i. All exhibit this behaviour.
It also knocked out connections to Git hosted in Azure DevOps (cloud).
Reverting to the previous version and restarting got things back up and running again.
I did not replicate the exact issue, because the agent eventually connects, but I do see a weird timeout. In this case I used an e2.micro Ubuntu 16.04 VM in GCP. From here I will test a trilead-ssh2 library without the timeout change.
Sep 30, 2020 6:20:54 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /home/inifc/remoting
<===[JENKINS REMOTING CAPACITY]===>channel started
Remoting version: 4.5
This is a Unix agent
connect timed out
SSH Connection failed with IOException: "connect timed out", retrying in 15 seconds. There are 7 more retries left.
Evacuated stdout
connect timed out
SSH Connection failed with IOException: "connect timed out", retrying in 15 seconds. There are 4 more retries left.
Agent successfully connected and online
OK, after reverting the change the GCP agent works as expected. What I wonder is why, because the change seems reasonable: it only adds a 120s timeout to the Object.wait calls to avoid an infinite wait. https://github.com/jenkinsci/trilead-ssh2/pull/50
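As a generic illustration of what a bounded wait involves (this is not the trilead-ssh2 code, and it does not claim to show the cause of this bug), here is a Python sketch using `threading.Condition`. The class name `KexState` and its methods are hypothetical. The point it shows is that a timed wait has two outcomes, and the caller has to tell "finished" apart from "timed out".

```python
import threading


class KexState:
    """Hypothetical stand-in for a 'key exchange finished' flag."""

    def __init__(self):
        self._cond = threading.Condition()
        self._finished = False

    def mark_finished(self):
        # Called by the thread that completes the exchange.
        with self._cond:
            self._finished = True
            self._cond.notify_all()

    def wait_finished(self, timeout_seconds=120.0):
        """Return True if finished, False if the timeout expired first."""
        with self._cond:
            # wait_for re-checks the predicate after every wake-up, so a
            # spurious wake-up or a timeout is never mistaken for completion.
            return self._cond.wait_for(
                lambda: self._finished, timeout=timeout_seconds
            )
```

An unbounded wait can only return once the flag is set. A bounded wait can also return with the flag still unset, so every caller needs an explicit branch for that case.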
Lost a day to this (downgrade resolves it):
Jenkins Server: Windows Server 2016 (VM)
2 MacOSX Agents (1 Catalina, 1 High Sierra, both Mac Mini bare metal)
Jenkins sat in an endless loop never connecting and on the MacOSX side the logs were filled with something to the effect of "sshd service exited with abnormal code 255" for each attempt.
Additionally, the naming of this plugin is unfortunate so I had no idea it was related to SSH until I found this ticket as a matter of lucky googling.
The release notes have a "Known issues" section; this ticket has been linked there for a few days.
No one is going to find release notes. Why not just release a .12 reverting the changes in .11 until things can be sorted out?
jglick because I cannot replicate the issue consistently, and in my tests everything works. I will release an incremental this weekend with a possible fix. I'll need someone who has the issue to check whether it is resolved or not.
I have a test environment I could try it on, if you can let me know how to install it and how to revert.
ifernandezcalvo I'm seeing an issue with 1.0.11 in my Docker environment that uses JDK 11 on Alpine and a combination of GCP, other cloud, and local agents. I'm happy to try the 1.0.12 release as well. For the moment, I've reverted my installation to 1.0.10 so that I can continue testing Jenkins 2.249.2-rc.
In the end I did not get the incremental configured in time for the trilead-ssh2 lib, but it does not matter. I have uploaded the snapshot from the revert-44-patch-2 artifacts to Artifactory (build-217-jenkins-25-SNAPSHOT), then bumped the version locally and generated a binary. You can install trilead-api.hpi manually from the Advanced tab of the plugin management page: submit the plugin there and, after restarting the instance, the new version should be installed. If the reverted change is the cause of the issue, everything should work. If not, to go back, open the Installed tab of the plugin management page, search for the trilead-api plugin, and downgrade to the previous version.
That pre-release allowed my 30 agents in various configurations to connect reliably with both JDK 8 and JDK 11 tests. The JDK 8 testing is running with Jenkins 2.249.1. The JDK 11 testing is running with Jenkins 2.249.2-rc.
SSH agents were connected from a Docker image of 2.249.1 and 2.249.2-rc including:
- CentOS 7 on Google Cloud
- CentOS 8 on Google Cloud
- Debian 9 on Google Cloud
- Debian 10 on Google Cloud
- Debian 10 on local network
- Debian testing on local network
- FreeBSD 12 on local network
- IBM PowerPC 64le on an IBM server
- IBM SystemZ on an IBM server
- OpenBSD 6.7 on local network
- Raspbian 10 on local network
- Ubuntu 18 on Google Cloud
- Ubuntu 20 on Google Cloud
- Windows 10 using Windows OpenSSH on local network
1.0.12-SNAPSHOT (private-4f699fb0-inifc) didn't work for me connecting to one Win2019 server.
nsleigh can you provide more details about the failure on your Windows 2019 server? Were you connecting through Windows OpenSSH or another SSH server? Does it work with 1.0.10? Does it fail with 1.0.11?
markewaite it is exactly the same as my original report (I reported this initially). It is Windows OpenSSH to Windows OpenSSH. v1.0.10 works and 1.0.11/1.0.12 fail in the same way.
I have reverted to v1.0.10 now and it is working again.
[10/05/20 13:13:06] [SSH] WARNING: SSH Host Keys are not being verified. Man-in-the-middle attacks may be possible against this connection.
Key exchange was not finished, connection is closed.
SSH Connection failed with IOException: "Key exchange was not finished, connection is closed.", retrying in 15 seconds. There are 1 more retries left.
[10/05/20 13:13:07] [SSH] WARNING: SSH Host Keys are not being verified. Man-in-the-middle attacks may be possible against this connection.
Key exchange was not finished, connection is closed.
[10/05/20 13:13:21] [SSH] WARNING: SSH Host Keys are not being verified. Man-in-the-middle attacks may be possible against this connection.
Key exchange was not finished, connection is closed.
ERROR: Connection is not established!
java.lang.IllegalStateException: Connection is not established!
at com.trilead.ssh2.Connection.getRemainingAuthMethods(Connection.java:988)
at com.cloudbees.jenkins.plugins.sshcredentials.impl.TrileadSSHPublicKeyAuthenticator.getRemainingAuthMethods(TrileadSSHPublicKeyAuthenticator.java:88)
at com.cloudbees.jenkins.plugins.sshcredentials.impl.TrileadSSHPublicKeyAuthenticator.canAuthenticate(TrileadSSHPublicKeyAuthenticator.java:80)
at com.cloudbees.jenkins.plugins.sshcredentials.SSHAuthenticator.newInstance(SSHAuthenticator.java:218)
at com.cloudbees.jenkins.plugins.sshcredentials.SSHAuthenticator.newInstance(SSHAuthenticator.java:171)
at hudson.plugins.sshslaves.SSHLauncher.openConnection(SSHLauncher.java:863)
at hudson.plugins.sshslaves.SSHLauncher$1.call(SSHLauncher.java:435)
at hudson.plugins.sshslaves.SSHLauncher$1.call(SSHLauncher.java:422)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
We are facing issues connecting EC2 (Amazon Linux) Jenkins executors: we can successfully ssh from the master to the executors, but they cannot connect from the Jenkins UI since we updated to LTS 2.249.1 and the latest version of the trilead-api plugin.
Do we know whether it affects Linux too? Only Windows has been mentioned so far.
markewaite sounds like you are able to reproduce a regression; have you tried bisecting https://github.com/jenkinsci/trilead-ssh2/compare/trilead-ssh2-build-217-jenkins-21...trilead-ssh2-build-217-jenkins-25 ? Looks like there were a bunch of significant changes.
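The bisecting idea can be sketched as a plain binary search over an ordered list of changes. This is a minimal sketch, assuming the first entry is known good, the last is known bad, and every change after the first bad one is also bad. The function `first_bad_index`, the `is_good` callback and the intermediate change names used with it are hypothetical.

```python
def first_bad_index(changes, is_good):
    """Return the index of the first change for which is_good() is False.

    Assumes changes[0] is known good, changes[-1] is known bad, and the
    list switches from good to bad exactly once.
    """
    lo, hi = 0, len(changes) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(changes[mid]):
            lo = mid
        else:
            hi = mid
    return hi
```

Since some reporters see the failure only on some agents, `is_good` would probably need to launch several agents, or retry several times, before declaring a build good.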
yrsurya https://issues.jenkins-ci.org/browse/JENKINS-63790?focusedCommentId=398405&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-398405 multiple people have reported the same issue on Linux.
yrsurya my testing with trilead-api-plugin 1.0.11 with Docker images running Jenkins 2.249.1 and 2.249.2-rc on Linux showed that I was not reliably getting connections to all of my agents. The problem affects Linux as well as Windows as far as I can tell.
jglick I have not attempted to bisect the changes from trilead-api-plugin 1.0.10 to 1.0.11.
We are not seeing this in dev-jenkins, where we have a replica of prod. The only difference is that dev runs in EKS (Kubernetes), and its agents are able to connect using SSH keys. The issue is with prod Jenkins (running in EC2).
yrsurya some of my agents using trilead-api-plugin 1.0.11 connect successfully while others do not. I didn't see any pattern that I recognized.
All my agents connect successfully using the trilead-api-plugin 1.0.12 pre-release that is referenced by ifernandezcalvo. nsleigh reports that his Windows Server 2019 agents do not connect reliably with either trilead-api-plugin 1.0.11 or trilead-api-plugin 1.0.12 pre-release. I don't know what's different between his configuration and mine, since my Windows 10.0.1909 agents connect reliably with trilead-api-plugin 1.0.12 pre-release and do not all connect reliably with trilead-api-plugin 1.0.11.
We just upgraded to ver 1.0.11 and got the same error. Had to rollback to 1.0.10 and it connects to agent again.
Same result as Larry Charbonneau (and others). Updated to v1.0.11 and could not connect to agents via ssh. Ours are Linux clients. Downgrading to v1.0.8 resolves the issues.
INFO: Waiting for SSH to come up. Sleeping 5.
Oct 07, 2020 1:34:05 PM hudson.plugins.ec2.EC2Cloud
INFO: No SSH key verification (ssh-ed25519 76:0e:b5:a3:f9:04:g3:a6:d6:61:70:1b:df:bf:05:5c) for connections to EC2 (ec2-slave) - deploy-slave (...)
Oct 07, 2020 1:34:05 PM hudson.plugins.ec2.EC2Cloud
INFO: Failed to connect via ssh: There was a problem while connecting to ...
If someone else with a test environment could test the pre-release attached to this Jira, we can confirm if the fix works and we would release a version with the fix.
I tried the attached pre-release and it did not solve the agent-connect issues for us:
SSHLauncher{host='s204.ourcompany.nl', port=22, credentialsId='c48df730-9351-4574-9895-4ab8f483eca7', jvmOptions='-Djava.io.tmpdir=/jenkins/tmp', javaPath='', prefixStartSlaveCmd='', suffixStartSlaveCmd='', launchTimeoutSeconds=60, maxNumRetries=10, retryWaitTime=15, sshHostKeyVerificationStrategy=hudson.plugins.sshslaves.verifiers.KnownHostsFileKeyVerificationStrategy, tcpNoDelay=true, trackCredentials=true}
[10/08/20 10:31:14] [SSH] Opening SSH connection to s204.ourcompany.nl:22.
Searching for s204.ourcompany.nl in /opt/jenkins/.ssh/known_hosts
Searching for s204.ourcompany.nl:22 in /opt/jenkins/.ssh/known_hosts
[10/08/20 10:31:14] [SSH] SSH host key matches key in Known Hosts file. Connection will be allowed.
Key exchange was not finished, connection is closed.
SSH Connection failed with IOException: "Key exchange was not finished, connection is closed.", retrying in 15 seconds. There are 10 more retries left.
Searching for s204.ourcompany.nl in /opt/jenkins/.ssh/known_hosts
Searching for s204.ourcompany.nl:22 in /opt/jenkins/.ssh/known_hosts
[10/08/20 10:31:30] [SSH] SSH host key matches key in Known Hosts file. Connection will be allowed.
Key exchange was not finished, connection is closed.
SSH Connection failed with IOException: "Key exchange was not finished, connection is closed.", retrying in 15 seconds. There are 9 more retries left.
Searching for s204.ourcompany.nl in /opt/jenkins/.ssh/known_hosts
Searching for s204.ourcompany.nl:22 in /opt/jenkins/.ssh/known_hosts
[10/08/20 10:31:45] [SSH] SSH host key matches key in Known Hosts file. Connection will be allowed.
Key exchange was not finished, connection is closed.
SSH Connection failed with IOException: "Key exchange was not finished, connection is closed.", retrying in 15 seconds. There are 8 more retries left.
Searching for s204.ourcompany.nl in /opt/jenkins/.ssh/known_hosts
Searching for s204.ourcompany.nl:22 in /opt/jenkins/.ssh/known_hosts
[10/08/20 10:32:01] [SSH] SSH host key matches key in Known Hosts file. Connection will be allowed.
Key exchange was not finished, connection is closed.
SSH Connection failed with IOException: "Key exchange was not finished, connection is closed.", retrying in 15 seconds. There are 7 more retries left.
Running with v1.0.10 works fine.
Private key is a 2048 bit RSA key, unencrypted
Both master and agent are on-prem CentOS servers:
- master is running CentOS Linux release 7.6.1810 (Core) / OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
- agent is running CentOS Linux release 7.8.2003 (Core) / OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
Jenkins v2.258
SSH Build Agents plugin v1.31.2
Same issue with OSX slave... master is a Windows 10 x64. Rolling back solves the issue.
Good news: I have an environment that replicates the issue. I've configured the EC2 plugin to provision t2.medium instances of Ubuntu 20.04 with Java 8 installed.
With trilead-api 1.0.10 it works
Connection from <IP> port 64966 on 172.20.1.252 port 22 rdomain ""
debug1: Local version string SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.1
debug1: Remote protocol version 2.0, remote software version TrileadSSH2Java_213
debug1: no match: TrileadSSH2Java_213
debug1: permanently_set_uid: 109/65534 [preauth]
debug1: list_hostkey_types: rsa-sha2-512,rsa-sha2-256,ssh-rsa,ecdsa-sha2-nistp256,ssh-ed25519 [preauth]
debug1: SSH2_MSG_KEXINIT sent [preauth]
User child is on pid 1419
debug1: do_cleanup
debug1: PAM: cleanup
debug1: PAM: closing session
pam_unix(sshd:session): session closed for user ubuntu
debug1: PAM: deleting credentials
debug1: temporarily_use_uid: 1000/1000 (e=0/0)
debug1: restore_uid: 0/0
debug1: audit_event: unhandled event 12
debug1: main_sigchld_handler: Child exited
debug1: SSH2_MSG_KEXINIT received [preauth]
debug1: kex: algorithm: diffie-hellman-group-exchange-sha256 [preauth]
debug1: kex: host key algorithm: ssh-ed25519 [preauth]
debug1: kex: client->server cipher: aes256-ctr MAC: hmac-sha2-512 compression: none [preauth]
debug1: kex: server->client cipher: aes256-ctr MAC: hmac-sha2-512 compression: none [preauth]
debug1: expecting SSH2_MSG_KEX_DH_GEX_REQUEST [preauth]
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST received [preauth]
debug1: SSH2_MSG_KEX_DH_GEX_GROUP sent [preauth]
debug1: expecting SSH2_MSG_KEX_DH_GEX_INIT [preauth]
debug1: rekey out after 4294967296 blocks [preauth]
debug1: SSH2_MSG_NEWKEYS sent [preauth]
debug1: expecting SSH2_MSG_NEWKEYS [preauth]
debug1: SSH2_MSG_NEWKEYS received [preauth]
debug1: rekey in after 4294967296 blocks [preauth]
debug1: KEX done [preauth]
debug1: userauth-request for user ubuntu service ssh-connection method none [preauth]
debug1: attempt 0 failures 0 [preauth]
debug1: PAM: initializing for "ubuntu"
debug1: PAM: setting PAM_RHOST to "<IP>"
debug1: PAM: setting PAM_TTY to "ssh"
debug1: userauth-request for user ubuntu service ssh-connection method publickey [preauth]
debug1: attempt 1 failures 0 [preauth]
debug1: temporarily_use_uid: 1000/1000 (e=0/0)
debug1: trying public key file /home/ubuntu/.ssh/authorized_keys
debug1: fd 5 clearing O_NONBLOCK
debug1: /home/ubuntu/.ssh/authorized_keys:1: matching key found: RSA SHA256:XXX
debug1: /home/ubuntu/.ssh/authorized_keys:1: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
Accepted key RSA SHA256:XXX found at /home/ubuntu/.ssh/authorized_keys:1
debug1: restore_uid: 0/0
debug1: auth_activate_options: setting new authentication options
debug1: do_pam_account: called
Accepted publickey for ubuntu from <IP> port 64966 ssh2: RSA SHA256:XXX
debug1: monitor_child_preauth: ubuntu has been authenticated by privileged process
debug1: auth_activate_options: setting new authentication options [preauth]
debug1: monitor_read_log: child log fd closed
debug1: PAM: establishing credentials
pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
And it fails with trilead-api 1.0.11
Connection from <IP> port 64888 on 172.20.1.252 port 22 rdomain ""
debug1: Local version string SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.1
debug1: Remote protocol version 2.0, remote software version TrileadSSH2Java_213
debug1: no match: TrileadSSH2Java_213
debug1: permanently_set_uid: 109/65534 [preauth]
debug1: list_hostkey_types: rsa-sha2-512,rsa-sha2-256,ssh-rsa,ecdsa-sha2-nistp256,ssh-ed25519 [preauth]
debug1: SSH2_MSG_KEXINIT sent [preauth]
debug1: SSH2_MSG_KEXINIT received [preauth]
debug1: kex: algorithm: diffie-hellman-group-exchange-sha256 [preauth]
debug1: kex: host key algorithm: ssh-ed25519 [preauth]
debug1: kex: client->server cipher: aes256-ctr MAC: hmac-sha2-512 compression: none [preauth]
debug1: kex: server->client cipher: aes256-ctr MAC: hmac-sha2-512 compression: none [preauth]
debug1: expecting SSH2_MSG_KEX_DH_GEX_REQUEST [preauth]
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST received [preauth]
debug1: SSH2_MSG_KEX_DH_GEX_GROUP sent [preauth]
debug1: expecting SSH2_MSG_KEX_DH_GEX_INIT [preauth]
debug1: rekey out after 4294967296 blocks [preauth]
debug1: SSH2_MSG_NEWKEYS sent [preauth]
debug1: expecting SSH2_MSG_NEWKEYS [preauth]
Connection closed by <IP> port 64888 [preauth]
debug1: do_cleanup [preauth]
debug1: monitor_read_log: child log fd closed
debug1: do_cleanup
debug1: Killing privsep child 1109
debug1: audit_event: unhandled event 12
debug1: main_sigchld_handler: Child exited
debug1: Forked child 1110.
debug1: Set /proc/self/oom_score_adj to 0
debug1: rexec start in 5 out 5 newsock 5 pipe 7 sock 8
debug1: inetd sockets after dupping: 4, 4
After taking a look at the logs: for some reason, in some environments the SSH2_MSG_NEWKEYS message is not sent from Jenkins, so the problem is in the key negotiation. The attached pre-release reverts a timeout-related change that has nothing to do with this; the failure comes from one of the PRs that add support for new algorithms.
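The comparison of the two sshd debug logs can be automated with a small Python sketch. It assumes the `debug1` lines have the form shown in the logs above ("SSH2_MSG_... sent" or "SSH2_MSG_... received"); the function names `kex_events` and `missing_events` are hypothetical.

```python
import re

# Matches entries such as "SSH2_MSG_NEWKEYS sent" or "SSH2_MSG_KEXINIT received".
# Lines like "expecting SSH2_MSG_NEWKEYS" are deliberately not matched.
EVENT_RE = re.compile(r"(SSH2_MSG_\w+) (sent|received)")


def kex_events(sshd_log):
    """Return the ordered list of sent/received SSH2_MSG_* events in a log."""
    return [f"{name} {direction}" for name, direction in EVENT_RE.findall(sshd_log)]


def missing_events(good_log, bad_log):
    """Return events present in the working log but absent from the failing one."""
    seen_in_bad = set(kex_events(bad_log))
    return [event for event in kex_events(good_log) if event not in seen_in_bad]
```

Applied to the two logs above, the only event that appears with 1.0.10 and not with 1.0.11 is "SSH2_MSG_NEWKEYS received": the server sent its SSH2_MSG_NEWKEYS and was still expecting the client's when the connection closed.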
My plan this weekend is to start from version 1.0.10 and add the changes one by one, testing the result in this environment. Once I find the PR that causes the issue, I will look into what the cause can be.
Could you give me more details about your environment? Which version of Jenkins do you have? Which version of the SSH Build Agents plugin? Which type of key do you use (DSA, RSA, ...) and what size? Is your key encrypted with a password? If so, which algorithm do you use? It may be a case I have missed.
I have tested:
for encrypted keys, I have tested