• Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Blocker
    • ec2-plugin
    • Jenkins ver. 2.138.1

      Similar to https://issues.jenkins-ci.org/browse/JENKINS-53876

      Unfortunately, with the latest ec2-plugin release (1.40.1), EC2 nodes no longer launch:

      $ cat Jenkins\ Prebuilt\ Slave\ \(sir-p9ai6v8m\)/slave.log
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: Launching instance: null
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: bootstrap()
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: Getting keypair...
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: Using key: master
      xxx
      -----BEGIN RSA PRIVATE KEY-----
      xxx
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: Authenticating as ubuntu
      ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
      java.lang.NullPointerException
      	at hudson.plugins.ec2.ssh.EC2UnixLauncher.getEC2HostAddress(EC2UnixLauncher.java:368)
      	at hudson.plugins.ec2.ssh.EC2UnixLauncher.connectToSsh(EC2UnixLauncher.java:318)
      	at hudson.plugins.ec2.ssh.EC2UnixLauncher.bootstrap(EC2UnixLauncher.java:282)
      	at hudson.plugins.ec2.ssh.EC2UnixLauncher.launchScript(EC2UnixLauncher.java:130)
      	at hudson.plugins.ec2.EC2ComputerLauncher.launch(EC2ComputerLauncher.java:48)
      	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)
      	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      

       

      Reverting back to 1.39 solves the issue.

      I see m5d instances being mentioned in the changelog - which we are using - perhaps related? 
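
      The trace above shows the NullPointerException being raised inside getEC2HostAddress, shortly after the log line "Launching instance: null". One possible reading, which this issue does not confirm, is that the instance description was not yet available when the launcher asked for its address. The sketch below illustrates a null-safe lookup under that assumption. It is not the plugin's code: `Instance`, `host_address`, and `use_public_ip` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    """Hypothetical stand-in for an EC2 instance description (not an AWS SDK type)."""
    private_ip: Optional[str] = None
    public_ip: Optional[str] = None


def host_address(instance: Optional[Instance], use_public_ip: bool) -> Optional[str]:
    """Return an address to connect to, or None if it is not known yet."""
    if instance is None:
        # Assumption: the instance has not been described yet, so the caller
        # should wait and retry instead of dereferencing a missing object.
        return None
    if use_public_ip and instance.public_ip:
        return instance.public_ip
    # Fall back to the private address (may itself be None while booting).
    return instance.private_ip
```

      Returning None instead of raising lets a caller decide whether to retry or give up, which is one way to avoid the hard failure seen in the log.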

       

       

          [JENKINS-53952] Linux agents are not starting anymore

          Günter Grodotzki created issue -

          Ross Derewianko added a comment -

          I've found this hit or miss: sometimes it happens, sometimes it doesn't.

          Mostly it happens when we've changed pre-existing connections, and it only happens for spot instances. Switching our instances to on-demand does not have the same problem.
          David Hayes made changes -
          Link New: This issue is duplicated by JENKINS-54071 [ JENKINS-54071 ]
          FABRIZIO MANFREDI made changes -
          Link Original: This issue is duplicated by JENKINS-54071 [ JENKINS-54071 ]

          FABRIZIO MANFREDI added a comment -

          Do you know whether the spot instance is still alive after the restart?

          Are you in a VPC or in Default?

          Are you using a public IP?

          Perrin Morrow added a comment -

          We just started hitting this. We've been running 1.40.1 for a couple of days (and the latest Jenkins version) without seeing it but then I added a new agent template, and trying to manually launch an agent from the template caused this error to occur. I haven't seen it in any of the instances that were started by the node provisioner, but I do see it when I manually launch an agent using one of the templates that has been working fine for the last few days. It doesn't happen every time though.

          We use spot instances, in a VPC, with no public IP. The instance is left running afterwards. I haven't seen it happen with on-demand instances.


          Nick Lloyd added a comment -

          We consistently hit this error using spot instances with the latest LTS Jenkins version.

          Using spot instances in a VPC with no public IP.


          Matt Hoy added a comment -

          Seeing an identical issue with m5.xlarge, the latest Jenkins, and spot instances. If I click "Launch Agent" after the instance has finished booting but before the timeout is reached, the agent is added successfully.

          FABRIZIO MANFREDI made changes -
          Comment [ https://github.com/jenkinsci/ec2-plugin/pull/252 :

          It implements a very simple algorithm:
           * if NodeProvisioner requested X executors, start to raise X/N nodes (N is the number of executors on each node).
           * if there are any orphan nodes (i.e. not known to Jenkins at all), take them into account (and start them if necessary)
           * if there are not enough orphan nodes, raise the needed nodes in parallel
           * give them back to NodeProvisioner in RUNNING state

            ]
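
          The steps listed in the comment above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the linked pull request: `Node`, `plan_nodes`, `provision`, `start_new`, and `ensure_running` are hypothetical names, and rounding X/N up is an assumption, since the comment does not say how fractions are handled.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for an EC2 agent; not a plugin class."""
    name: str
    state: str  # e.g. "STOPPED" or "RUNNING"


def plan_nodes(requested_executors, executors_per_node, orphan_count):
    """Return (orphans_to_reuse, new_nodes_to_start) for a request of X executors."""
    if executors_per_node <= 0:
        raise ValueError("executors_per_node must be positive")
    # Assumption: X/N is rounded up so the whole request is covered.
    needed = math.ceil(max(requested_executors, 0) / executors_per_node)
    reused = min(needed, orphan_count)
    return reused, needed - reused


def provision(requested_executors, executors_per_node, orphans, start_new, ensure_running):
    """Reuse orphan nodes first, then start any missing nodes in parallel."""
    reused, new_count = plan_nodes(requested_executors, executors_per_node, len(orphans))
    # ensure_running is assumed to start an orphan if necessary and return it.
    nodes = [ensure_running(node) for node in orphans[:reused]]
    if new_count > 0:
        # start_new is assumed to block until the new node is running.
        with ThreadPoolExecutor(max_workers=new_count) as pool:
            nodes.extend(pool.map(start_new, range(new_count)))
    return nodes
```

          With 5 requested executors, 2 executors per node, and 1 orphan node, `plan_nodes(5, 2, 1)` yields one reused orphan and two new nodes.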

          Günter Grodotzki added a comment -

          Still having it with c5.xlarge + plugin version 1.41

            mramonleon Ramon Leon
            lifeofguenter Günter Grodotzki
            Votes: 14
            Watchers: 25
