  1. Jenkins
  2. JENKINS-72298

Allow configurable timeout for key exchange with EC2 agents


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • Component/s: ec2-plugin
    • Labels: None
    • Environment: Jenkins server version 2.414.3 on Debian 11, EC2 plugin version 1628.v6d7b_fc58b_a_1d

      Hello, 

      (Description updated: rephrased as requested by Mark)

      I have been facing an issue since yesterday. No update was made on the server side or to the plugins, so I think AWS may have a performance issue, but I can no longer deploy our Windows Server Jenkins agents because of a timeout during the SSH key exchange.

      The Jenkins controller deploys an instance, but the connection times out, and after 5 retries the instance is destroyed. The AMI and the network configuration are good: I can connect to the instance before it is destroyed, and I can manually deploy an instance with the same parameters as those defined in Jenkins.

      Here are the server-side logs:

      Nov 07 16:16:41 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:41.774+0000 [id=42]        INFO        h.p.ec2.EC2RetentionStrategy#start: Start requested for EC2 (ec2-cicd) - Win-BF-ue52 (i-02afeb443e0d9ba1e)
      Nov 07 16:16:41 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:41.775+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Launching instance: i-02afeb443e0d9ba1e
      Nov 07 16:16:41 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:41.776+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: bootstrap()
      Nov 07 16:16:41 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:41.776+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Getting keypair...
      Nov 07 16:16:41 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:41.777+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Using private key jenkins (SHA-1 fingerprint e3:70:f3:e8:54:8e:39:20:1e:36:4a:e8:61:07:6b:76:23:17:69:35)
      Nov 07 16:16:41 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:41.777+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Authenticating as jenkins
      Nov 07 16:16:41 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:41.780+0000 [id=42]        INFO        hudson.slaves.NodeProvisioner#update: EC2 (ec2-cicd) - Win-BF-ue52 provisioning successfully completed. We have now 17 computer(s)
      Nov 07 16:16:41 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:41.860+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Connecting to 10.6.5.237 on port 22, with timeout 10000.
      Nov 07 16:16:51 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:51.861+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
      Nov 07 16:16:51 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:51.862+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Waiting for SSH to come up. Sleeping 5.
      Nov 07 16:16:56 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:16:56.939+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Connecting to 10.6.5.237 on port 22, with timeout 10000.
      Nov 07 16:17:06 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:17:06.941+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
      Nov 07 16:17:06 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:17:06.942+0000 [id=287]        INFO        hudson.plugins.ec2.EC2Cloud#log: Waiting for SSH to come up. Sleeping 5.
      Nov 07 16:17:12 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:17:12.340+0000 [id=320]        INFO        h.plugins.ec2.EC2OndemandSlave#lambda$terminate$0: Terminated EC2 instance (terminated): i-02afeb443e0d9ba1e
      Nov 07 16:17:12 jenkins-cicd-us-east-1f-0 jenkins[5694]: 2023-11-07 16:17:12.344+0000 [id=320]        INFO        h.plugins.ec2.EC2OndemandSlave#lambda$terminate$0: Removed EC2 instance from jenkins controller: i-02afeb443e0d9ba1e
      Nov 07 16:17:18 jenkins-cicd-us-east-1f-0 jenkins[5694]: [11/07/23 16:17:18] SSH Launch of test-win-bf-ue51 on 10.6.5.26 failed in 198,856 ms

       

      I saw a related issue here: https://issues.jenkins.io/browse/JENKINS-30284?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&showAll=true , but it does not apply to my case. After searching the code, I found that the SSH connection uses the library https://javadoc.io/doc/com.trilead/trilead-ssh2/latest/com/trilead/ssh2/Connection.html . Each connection is created with Connection(host, port), and we cannot specify the connectTimeout and kexTimeout parameters that the library defines in public ConnectionInfo connect(ServerHostKeyVerifier verifier, int connectTimeout, int kexTimeout) throws java.io.IOException.

      It would be useful to define integer parameters such as jenkins.ec2.sshConnectTimeout and jenkins.ec2.sshKexTimeout and to call the connect function with them. I think this could fix the timeout during the key exchange.
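
      To make the proposal concrete, here is a minimal sketch of how the two properties could be read, assuming they are passed as JVM system properties. The class name Ec2SshTimeouts, its method names, and the 10000 ms default (taken from the log above) are assumptions for illustration and are not part of the EC2 plugin. The trilead-ssh2 call is shown only as a comment, so the sketch runs without the library.

```java
/**
 * Sketch only: resolves SSH timeouts from JVM system properties.
 * The property names come from the proposal in this issue; the class,
 * the method names and the default value are hypothetical.
 */
final class Ec2SshTimeouts {
    // Proposed property names (not existing plugin options).
    static final String CONNECT_TIMEOUT_PROPERTY = "jenkins.ec2.sshConnectTimeout";
    static final String KEX_TIMEOUT_PROPERTY = "jenkins.ec2.sshKexTimeout";

    // Assumed default: the 10000 ms value that appears in the log above.
    static final int DEFAULT_TIMEOUT_MILLIS = 10_000;

    private Ec2SshTimeouts() {
    }

    static int connectTimeoutMillis() {
        return readMillis(CONNECT_TIMEOUT_PROPERTY);
    }

    static int kexTimeoutMillis() {
        return readMillis(KEX_TIMEOUT_PROPERTY);
    }

    private static int readMillis(String property) {
        // Integer.getInteger returns the default when the property is
        // unset or cannot be parsed as an integer.
        int value = Integer.getInteger(property, DEFAULT_TIMEOUT_MILLIS);
        // Assumption: a negative timeout is a configuration mistake,
        // so fall back to the default instead of passing it on.
        return value >= 0 ? value : DEFAULT_TIMEOUT_MILLIS;
    }

    public static void main(String[] args) {
        int connectTimeout = connectTimeoutMillis();
        int kexTimeout = kexTimeoutMillis();
        System.out.println("connectTimeout=" + connectTimeout
                + " ms, kexTimeout=" + kexTimeout + " ms");
        // With trilead-ssh2 on the classpath, the values would be passed
        // to the connect overload quoted in this issue:
        //   Connection conn = new Connection(host, port);
        //   conn.connect(verifier, connectTimeout, kexTimeout);
    }
}
```

      With something like this in place, a longer key exchange timeout could be set when starting the JVM, for example with -Djenkins.ec2.sshKexTimeout=60000, without rebuilding the plugin.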

      Thanks

      Best Regards 

            thoulen FABRIZIO MANFREDI
            stfdel S
            Votes: 0
            Watchers: 1
