[JENKINS-64520] EC2 node does not start after stop/disconnect with the Idle termination time parameter

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: ec2-plugin
    • Environment: Debian 10; Jenkins 2.263.1, 2.230; Amazon EC2 plugin 1.56, 1.53

      We have a problem: after the EC2 instances are stopped, the nodes do not start again. This is reproducible on three Jenkins servers, one of which is a clean installation with the Amazon EC2 plugin.

      The plugin settings and the access to EC2 are the same on all of them.

      Common settings of the EC2 plugin:

      Idle termination time: 30

      Stop/Disconnect on Idle Timeout: yes

      Minimum number of instances: 0

      Minimum number of spare instances: 0
      Host key Verification Strategy: Off

      Connection strategy: PublicDNS

       

      When I test the credentials and check the AMI, the plugin reports success.

       

      Logs from one node when I launch it:

      2020-12-28 12:33:18.949+0000 [id=25] INFO h.p.ec2.EC2RetentionStrategy#internalCheck: Idle timeout of EC2 (Test Amazon connection) - Jenkins PHP70 (same-instance-id) after 30 idle minutes, instance statusRUNNING
      2020-12-28 12:33:18.950+0000 [id=25] INFO h.plugins.ec2.EC2AbstractSlave#idleTimeout: EC2 instance idle time expired: same-instance-id
      2020-12-28 12:33:19.220+0000 [id=25] INFO h.plugins.ec2.EC2AbstractSlave#stop: EC2 instance stop request sent for same-instance-id
      2020-12-28 12:38:01.182+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Launching instance: same-instance-id
      2020-12-28 12:38:01.183+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: bootstrap()
      2020-12-28 12:38:01.184+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Getting keypair...
      2020-12-28 12:38:01.184+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Using private key jenkins (SHA-1 fingerprint same-fingerprint)
      2020-12-28 12:38:01.184+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Authenticating as jenkins
      2020-12-28 12:38:01.286+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Connecting to null on port 22, with timeout 10000.
      2020-12-28 12:38:01.287+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Failed to connect via ssh: There was a problem while connecting to null:22
      2020-12-28 12:38:01.288+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Waiting for SSH to come up. Sleeping 5.
      2020-12-28 12:38:06.363+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Connecting to null on port 22, with timeout 10000.
      2020-12-28 12:38:06.364+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Failed to connect via ssh: There was a problem while connecting to null:22
      2020-12-28 12:38:06.364+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: Waiting for SSH to come up. Sleeping 5.
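
The symptom in the log above (a stop request followed by repeated connection attempts to `null`) can be spotted mechanically. The following is a minimal Python sketch, not part of the plugin: the two regular expressions and the function name `find_null_host_attempts` are assumptions derived only from the log lines quoted in this report.

```python
import re

# Patterns inferred from the sample log lines in this report (an assumption,
# not taken from the plugin's source code).
STOP_RE = re.compile(r"EC2 instance stop request sent for (\S+)")
CONNECT_RE = re.compile(r"Connecting to (\S+) on port (\d+)")


def find_null_host_attempts(lines):
    """Return (stopped_ids, null_attempts) found in an iterable of log lines."""
    stopped_ids = []
    null_attempts = 0
    for line in lines:
        stop = STOP_RE.search(line)
        if stop:
            stopped_ids.append(stop.group(1))
            continue
        connect = CONNECT_RE.search(line)
        # A literal "null" host means no address was available for SSH.
        if connect and connect.group(1) == "null":
            null_attempts += 1
    return stopped_ids, null_attempts


SAMPLE = [
    "2020-12-28 12:33:19.220+0000 [id=25] INFO h.plugins.ec2.EC2AbstractSlave#stop: "
    "EC2 instance stop request sent for same-instance-id",
    "2020-12-28 12:38:01.286+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: "
    "Connecting to null on port 22, with timeout 10000.",
    "2020-12-28 12:38:06.363+0000 [id=1078] INFO hudson.plugins.ec2.EC2Cloud#log: "
    "Connecting to null on port 22, with timeout 10000.",
]

stopped, attempts = find_null_host_attempts(SAMPLE)
print(stopped, attempts)
```

Running the same function over a full Jenkins log would show whether every `null` connection attempt is preceded by a stop request for the same instance.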

       

      On another Jenkins server, the launch log says:

      Dec 28, 2020 7:15:36 AM hudson.plugins.ec2.EC2Cloud
      INFO: Connecting to private-ip-ec2-instance.compute.internal on port 22, with timeout 10000.
      Dec 28, 2020 7:15:36 AM hudson.plugins.ec2.EC2Cloud
      INFO: Failed to connect via ssh: There was a problem while connecting to private-ip-ec2-instance:22
      Dec 28, 2020 7:15:36 AM hudson.plugins.ec2.EC2Cloud
      INFO: Waiting for SSH to come up. Sleeping 5.
       

          Nikolay created issue -
          Raihaan Shouhell added a comment -

          Can you confirm that the servers are reachable via SSH?

          Nikolay added a comment -

          They are reachable via SSH when the servers are running.

          The problem is that they don't start when you click "Launch agent".

          Raihaan Shouhell added a comment -

          The bottom log seems correct, since it shows the expected host, whereas the one above it shows null as the host.

          Do you mean that the nodes have been stopped on the EC2 console and you expect Launch agent to start the EC2 instance and then attempt to connect?
          Piyush Prajapati made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]
          Piyush Prajapati made changes -
          Status Original: In Progress [ 3 ] New: Open [ 1 ]

          Nikolay added a comment - edited

          Yes, it worked for me with EC2 plugin version ~1.4.

          Nikolay added a comment -

          In all the cases shown in the logs, the VM in EC2 does not start again. I don't see any error logs related to this. As far as I know, the IAM user has full permissions.

          Raihaan Shouhell added a comment -

          AFAIK, the launch agent button has never actually attempted to start a stopped instance.
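
If Launch agent does not start a stopped instance, one possible workaround outside the plugin is to start the instance before launching the agent. The Python sketch below illustrates that idea only and makes no claim about how the plugin behaves: `ensure_running` is a hypothetical helper, the client is assumed to behave like `boto3.client("ec2")` (using its `describe_instances`, `start_instances` and `get_waiter("instance_running")` calls), and `FakeEc2` is an in-memory stand-in so the example runs without AWS access. It also assumes that a stopped instance without an Elastic IP reports an empty public DNS name, which would be consistent with the `null` host in the first log.

```python
def ensure_running(ec2_client, instance_id):
    """Start the instance if it is stopped and return its public DNS name.

    ec2_client is assumed to behave like boto3.client("ec2").
    """
    def describe():
        resp = ec2_client.describe_instances(InstanceIds=[instance_id])
        return resp["Reservations"][0]["Instances"][0]

    instance = describe()
    if instance["State"]["Name"] == "stopped":
        ec2_client.start_instances(InstanceIds=[instance_id])
        # Block until EC2 reports the instance as running.
        ec2_client.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        instance = describe()
    # Treat an empty PublicDnsName as "no public host available".
    return instance.get("PublicDnsName") or None


class FakeEc2:
    """In-memory stand-in for the EC2 client, for illustration only."""

    def __init__(self):
        self.state = "stopped"
        self.started = []

    def describe_instances(self, InstanceIds):
        running = self.state == "running"
        return {"Reservations": [{"Instances": [{
            "InstanceId": InstanceIds[0],
            "State": {"Name": self.state},
            "PublicDnsName": "ec2-host.example.com" if running else "",
        }]}]}

    def start_instances(self, InstanceIds):
        self.started.extend(InstanceIds)
        self.state = "running"

    def get_waiter(self, name):
        class _Waiter:
            def wait(self, InstanceIds):
                return None
        return _Waiter()


fake = FakeEc2()
host = ensure_running(fake, "i-hypothetical")
print(host)
```

With a real client, the call would look like `ensure_running(boto3.client("ec2"), instance_id)`, provided the IAM user is allowed to describe and start instances.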

            thoulen FABRIZIO MANFREDI
            nlopyrev Nikolay
            Votes: 3
            Watchers: 7