• Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Blocker
    • ec2-plugin
    • Jenkins ver. 2.138.1

      Similar to https://issues.jenkins-ci.org/browse/JENKINS-53876

      Unfortunately, with the latest plugin version (1.40.1), EC2 nodes are no longer launching:

      $ cat Jenkins\ Prebuilt\ Slave\ \(sir-p9ai6v8m\)/slave.log
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: Launching instance: null
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: bootstrap()
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: Getting keypair...
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: Using key: master
      xxx
      -----BEGIN RSA PRIVATE KEY-----
      xxx
      Oct 08, 2018 10:43:49 PM hudson.plugins.ec2.EC2Cloud
      INFO: Authenticating as ubuntu
      ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
      java.lang.NullPointerException
      	at hudson.plugins.ec2.ssh.EC2UnixLauncher.getEC2HostAddress(EC2UnixLauncher.java:368)
      	at hudson.plugins.ec2.ssh.EC2UnixLauncher.connectToSsh(EC2UnixLauncher.java:318)
      	at hudson.plugins.ec2.ssh.EC2UnixLauncher.bootstrap(EC2UnixLauncher.java:282)
      	at hudson.plugins.ec2.ssh.EC2UnixLauncher.launchScript(EC2UnixLauncher.java:130)
      	at hudson.plugins.ec2.EC2ComputerLauncher.launch(EC2ComputerLauncher.java:48)
      	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)
      	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      

       

      Reverting back to 1.39 solves the issue.

      I see m5d instances being mentioned in the changelog - which we are using - perhaps related? 
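The trace above shows a NullPointerException in getEC2HostAddress right after the log line "Launching instance: null". Purely as an illustration of the failure mode (this is not the plugin's code), a null-safe address lookup could look like the sketch below. HostAddressSketch, InstanceInfo and hostAddress are hypothetical names invented for this example, and preferring the public IP over the private one is an assumption.

```java
import java.util.Optional;

final class HostAddressSketch {
    // Hypothetical stand-in for an instance description; not a plugin or AWS SDK type.
    static final class InstanceInfo {
        final String privateIp; // may be null until an address is assigned
        final String publicIp;  // may be null, e.g. in a VPC without public IPs

        InstanceInfo(String privateIp, String publicIp) {
            this.privateIp = privateIp;
            this.publicIp = publicIp;
        }
    }

    // Returns an address to connect to, or empty when the instance or its
    // addresses are not known yet, instead of dereferencing null.
    static Optional<String> hostAddress(InstanceInfo instance) {
        if (instance == null) {
            return Optional.empty(); // e.g. the spot request has no instance yet
        }
        if (instance.publicIp != null && !instance.publicIp.isEmpty()) {
            return Optional.of(instance.publicIp);
        }
        if (instance.privateIp != null && !instance.privateIp.isEmpty()) {
            return Optional.of(instance.privateIp);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(hostAddress(null).isPresent());
        System.out.println(hostAddress(new InstanceInfo("10.0.0.5", null)).orElse("none"));
    }
}
```

A caller that receives an empty result can log the condition and retry later rather than failing the whole launch.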

       

       

          [JENKINS-53952] Linux agents are not starting anymore

          Ross Derewianko added a comment -

          I've found this hit or miss: sometimes it happens, sometimes it doesn't.

          Mostly it happens when we've changed pre-existing connections, and it only happens for spot instances. Switching our instances to on-demand does not have the same problem.

          FABRIZIO MANFREDI added a comment -

          Do you know if the spot instance is still alive after the restart?

          Are you in a VPC or in Default?

          Are you using a public IP?

          Perrin Morrow added a comment -

          We just started hitting this. We've been running 1.40.1 for a couple of days (and the latest Jenkins version) without seeing it but then I added a new agent template, and trying to manually launch an agent from the template caused this error to occur. I haven't seen it in any of the instances that were started by the node provisioner, but I do see it when I manually launch an agent using one of the templates that has been working fine for the last few days. It doesn't happen every time though.

          We use spot instances, in a VPC, with no public IP. The instance is left running afterwards. I haven't seen it happen with on-demand instances.


          Nick Lloyd added a comment -

          We consistently hit this error using spot instances with the latest LTS Jenkins version.

          Using spot instances in a VPC with no public IP.


          Matt Hoy added a comment -

          Seeing an identical issue using M5.Xlarge, latest Jenkins, and spot instances. If I hit "Launch Agent" after the instance has finished booting but before the timeout is reached, the agent is added successfully.


          Günter Grodotzki added a comment -

          Still having it with c5.xlarge + plugin version 1.41

          Anton Yurchenko added a comment - - edited

          We've got the same issue. Jenkins ver 2.150.1, plugin version 1.41.

          That issue occurs only when requesting a spot instance.

          Dec 19, 2018 7:27:36 AM hudson.plugins.ec2.EC2Cloud
          INFO: Launching instance: null
          Dec 19, 2018 7:27:36 AM hudson.plugins.ec2.EC2Cloud
          INFO: bootstrap()
          Dec 19, 2018 7:27:36 AM hudson.plugins.ec2.EC2Cloud
          INFO: Getting keypair...
          Dec 19, 2018 7:27:37 AM hudson.plugins.ec2.EC2Cloud
          INFO: Using key: ssh-key
          [..............................]
          Dec 19, 2018 7:27:37 AM hudson.plugins.ec2.EC2Cloud
          INFO: Authenticating as ec2-user
          ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
          java.lang.NullPointerException
          	at hudson.plugins.ec2.ssh.EC2UnixLauncher.getEC2HostAddress(EC2UnixLauncher.java:367)
          	at hudson.plugins.ec2.ssh.EC2UnixLauncher.connectToSsh(EC2UnixLauncher.java:319)
          	at hudson.plugins.ec2.ssh.EC2UnixLauncher.bootstrap(EC2UnixLauncher.java:283)
          	at hudson.plugins.ec2.ssh.EC2UnixLauncher.launchScript(EC2UnixLauncher.java:130)
          	at hudson.plugins.ec2.EC2ComputerLauncher.launch(EC2ComputerLauncher.java:48)
          	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)
          	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          	at java.lang.Thread.run(Thread.java:748)
          
          Dec 19, 2018 7:27:35 AM hudson.plugins.ec2.SlaveTemplate provisionSpot
          INFO: Launching ami-04750af7488d87g5c for template builder
          Dec 19, 2018 7:27:36 AM hudson.plugins.ec2.SlaveTemplate provisionSpot
          INFO: Spot instance id in provision: sir-8mki4f8k
          Dec 19, 2018 7:27:36 AM hudson.plugins.ec2.EC2RetentionStrategy start
          INFO: Start requested for builder (sir-8mki4f8k)
          Dec 19, 2018 7:27:36 AM hudson.plugins.ec2.EC2Cloud log
          INFO: Launching instance: null
          Dec 19, 2018 7:27:36 AM hudson.plugins.ec2.EC2Cloud log
          INFO: bootstrap()
          Dec 19, 2018 7:27:36 AM hudson.plugins.ec2.EC2Cloud log
          INFO: Getting keypair...
          Dec 19, 2018 7:27:37 AM hudson.plugins.ec2.EC2Cloud log
          INFO: Using key: ssh-key
          [................................]
          Dec 19, 2018 7:27:37 AM hudson.plugins.ec2.EC2Cloud log
          INFO: Authenticating as ec2-user
          

          Manually clicking Launch Agent works, and the instance comes online fine.
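The log above shows the launch starting in the same second the spot request (sir-8mki4f8k) is created, while a later manual Launch Agent succeeds. That is consistent with, though it does not prove, the instance not yet being attached to the spot request when the launcher first runs. A minimal sketch of waiting for an instance ID before connecting might look like this. SpotWaitSketch and awaitInstanceId are hypothetical names, and the lookup supplier stands in for whatever call reports the fulfilled instance; none of this is the plugin's code.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class SpotWaitSketch {
    // Polls the lookup until it yields an instance ID or the attempts run out.
    // Returns empty on timeout or interruption, so the caller never sees null.
    static Optional<String> awaitInstanceId(Supplier<Optional<String>> lookup,
                                            int maxAttempts,
                                            long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<String> id = lookup.get();
            if (id.isPresent()) {
                return id;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated lookup: the request is "fulfilled" on the third poll.
        Optional<String> id = awaitInstanceId(
                () -> ++calls[0] < 3 ? Optional.<String>empty() : Optional.of("i-example"),
                5, 10L);
        System.out.println(id.orElse("not fulfilled") + " after " + calls[0] + " polls");
    }
}
```

Bounding the number of attempts keeps a request that is never fulfilled from blocking the launcher thread indefinitely.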


          Rajat Umrao added a comment - - edited

          thoulen The issue still persists in plugin 1.42 as well.

          The issue occurs only with spot instances; on-demand slaves work perfectly fine.

          Apr 01, 2019 1:17:45 PM hudson.plugins.ec2.EC2Cloud
          INFO: Launching instance: null
          Apr 01, 2019 1:17:45 PM hudson.plugins.ec2.EC2Cloud
          INFO: bootstrap()
          Apr 01, 2019 1:17:45 PM hudson.plugins.ec2.EC2Cloud
          INFO: Getting keypair...
          Apr 01, 2019 1:17:45 PM hudson.plugins.ec2.EC2Cloud
          INFO: Using key: access-key
          aa:d0:04:24:13:b4:e5:d8:88:09:6a:55:cb:a7:43:fd
          ----BEGIN RSA PRIVATE KEY----
          MIIEpQIBAAKCAQEAxKuhBhKMikwrGYB/gWH1pBqATDR0WYW60dD6PtsLO0k76+
          ntd9Mi5IS4u+V0ANiRP3Kc6d+IjrH2KmL45Y9OHMZcPRj9AoMOxz0yTRYd3sDeW
          Apr 01, 2019 1:17:45 PM hudson.plugins.ec2.EC2Cloud
          INFO: Authenticating as root
          ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
          java.lang.NullPointerException
          	at hudson.plugins.ec2.ssh.EC2UnixLauncher.getEC2HostAddress(EC2UnixLauncher.java:369)
          	at hudson.plugins.ec2.ssh.EC2UnixLauncher.connectToSsh(EC2UnixLauncher.java:319)
          	at hudson.plugins.ec2.ssh.EC2UnixLauncher.bootstrap(EC2UnixLauncher.java:283)
          	at hudson.plugins.ec2.ssh.EC2UnixLauncher.launchScript(EC2UnixLauncher.java:130)
          	at hudson.plugins.ec2.EC2ComputerLauncher.launch(EC2ComputerLauncher.java:48)
          	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)
          	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          	at java.lang.Thread.run(Thread.java:748)
          

           


          Jonathan B added a comment -

          Any updates on this? This failure to start spot instances makes it infeasible for us to upgrade past 1.39, but the occasional deadlocks when spawning instances (fixed in 1.42, apparently https://issues.jenkins-ci.org/browse/JENKINS-54187) are extremely disruptive.


          Michael Dodsworth added a comment -

          Submitted a PR for this, if someone can take a look - https://github.com/jenkinsci/ec2-plugin/pull/367

          FABRIZIO MANFREDI added a comment - - edited

          Merged in 1.45. Can you confirm that it is fixed?


          Gil Shinar added a comment -

          Installed 1.45 on my Jenkins. Upgraded from 1.43. Hopefully it'll solve the issue of slaves stuck in offline mode.

          Until this morning, the issue was that the instances were up and running while the slaves were offline. This morning, the instances were down as well as the slaves.


          Gil Shinar added a comment -

          OK. Seems like it got worse. Usually it happened during the weekend or overnight. Now slaves fail to start during the day.

          I'm out of ideas.


          Gil Shinar added a comment -

          Yesterday morning I came in and it looked fine. This morning, again, all instances were up and all slaves were offline, while a few jobs had been waiting in the queue for a long time.

          It's not just that the slaves fail to start. It's even worse: the instances keep running and wasting money, which is the opposite of what this plugin is all about.


          Gil Shinar added a comment -

          I changed the configuration to use terminate/start instead of stop/start, and it has worked fine for a week. Maybe this helps with solving the issue.


          ovi craciun added a comment - - edited

          Jenkins ver. 2.204.2
          EC2 plugin: 1.48

          We see a similar problem: it crashes on the getPrivateIpAddress or getPublicIpAddress call, depending on which connection strategy I have chosen (public IP or private IP).
          The interesting part is that it fails for spot instances of type c5.2xlarge (or anything larger; I made sure we bid enough to get those instances), while t2.2xlarge works well.

          Here's the error we are seeing:

          ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
          java.lang.NullPointerException
          	at hudson.plugins.ec2.EC2HostAddressProvider.getPrivateIpAddress(EC2HostAddressProvider.java:49)
          	at hudson.plugins.ec2.EC2HostAddressProvider.windows(EC2HostAddressProvider.java:28)
          	at hudson.plugins.ec2.win.EC2WindowsLauncher.connectToWinRM(EC2WindowsLauncher.java:134)
          	at hudson.plugins.ec2.win.EC2WindowsLauncher.launchScript(EC2WindowsLauncher.java:39)
          	at hudson.plugins.ec2.EC2ComputerLauncher.launch(EC2ComputerLauncher.java:48)
          	at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:291)
          	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
          	at java.util.concurrent.FutureTask.run(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          	at java.lang.Thread.run(Unknown Source)
          

          Later edit:
          We updated the EC2 plugin to version 1.49.1 and it no longer reproduces.
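The comment above says the crash moves between getPrivateIpAddress and getPublicIpAddress depending on the chosen connection strategy. As a hedged illustration only, the sketch below selects an address by strategy and fails with a descriptive message when that address is missing, rather than with a bare NullPointerException. AddressSelectionSketch, Strategy and select are hypothetical names for this example and do not reflect the plugin's EC2HostAddressProvider.

```java
final class AddressSelectionSketch {
    // Hypothetical connection strategies; the plugin's own options may differ.
    enum Strategy { PUBLIC_IP, PRIVATE_IP }

    // Picks the address matching the strategy and reports clearly when it is absent.
    static String select(Strategy strategy, String publicIp, String privateIp) {
        String chosen = (strategy == Strategy.PUBLIC_IP) ? publicIp : privateIp;
        if (chosen == null || chosen.isEmpty()) {
            throw new IllegalStateException(
                    "No address available yet for strategy " + strategy);
        }
        return chosen;
    }

    public static void main(String[] args) {
        System.out.println(select(Strategy.PRIVATE_IP, null, "10.0.0.7"));
        try {
            select(Strategy.PUBLIC_IP, null, "10.0.0.7");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

An explicit exception message names the missing address type in the agent log, which makes reports like the ones in this thread easier to triage.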


          Ramon Leon added a comment -

          Closing as per latest comment


            mramonleon Ramon Leon
            lifeofguenter Günter Grodotzki
            Votes: 14
            Watchers: 25