• Bug
    • Resolution: Fixed
    • Blocker
    • jclouds-plugin
    • None
    • Jenkins 1.466.2, JClouds Plug-In 2.3.1, Ubuntu 12.04 x64. Connecting to Amazon EC2 trying to run an Ubuntu 12.04 image

      I had trouble using the base Ubuntu 12.04 image on Amazon (ami-3d4ff254), so I've been making my own AMI based on it that has Java and the build tools I need pre-installed. I have configured JClouds with "Use Pre-existing user" and "Use Pre-installed Java". The plug-in starts the machine and connects to it, then I get:

      [#|2012-11-05T16:12:48.801-0500|WARNING|oracle-glassfish3.1.1|hudson.slaves.NodeProvisioner|_ThreadID=26;_ThreadName=Thread-2;|Provisioned slave testaws failed to launch
      java.util.concurrent.ExecutionException: java.io.IOException: Error during SCP transfer.
      at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
      at java.util.concurrent.FutureTask.get(FutureTask.java:83)
      at jenkins.plugins.jclouds.compute.JCloudsCloud$2.call(JCloudsCloud.java:227)
      at jenkins.plugins.jclouds.compute.JCloudsCloud$2.call(JCloudsCloud.java:213)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Error during SCP transfer.
      at com.trilead.ssh2.SCPClient.put(SCPClient.java:510)
      at com.trilead.ssh2.SCPClient.put(SCPClient.java:466)
      at jenkins.plugins.jclouds.compute.JCloudsLauncher.launch(JCloudsLauncher.java:71)
      at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:200)
      ... 5 more
      Caused by: java.io.IOException: Remote scp sent illegal error code.
      at com.trilead.ssh2.SCPClient.readResponse(SCPClient.java:53)
      at com.trilead.ssh2.SCPClient.sendBytes(SCPClient.java:137)
      at com.trilead.ssh2.SCPClient.put(SCPClient.java:506)
      ... 8 more

      #]

      The failing line (JCloudsLauncher:71) appears to be trying to copy Slave.jar to /tmp on the node. Copying this file manually with scp succeeds. Any ideas on how to debug this?

          [JENKINS-15727] JClouds Plug-In failing to start slave

          Andrew Bayer added a comment -

          Any chance you could get on the instance after that happens and see what shows up in /var/log/secure, say?

          Nathan Sharp added a comment -

          Sure enough! I found entries in /var/log/auth.log showing attempted logins as root. I couldn't figure out how to have the plug-in log in as the 'ubuntu' user while still using public key authentication, but setting disable_root to 0 in /etc/cloud/cloud.cfg in my Amazon image got me past that. I now have a simple job running, but I'm still having some problems. When the first job is queued, it starts a new Amazon machine to service it. However, if I put a second job in the queue, I get this stack trace when it tries to start the second Amazon machine:

          [#|2012-11-06T11:48:08.801-0500|WARNING|oracle-glassfish3.1.1|hudson.slaves.NodeProvisioner|_ThreadID=26;_ThreadName=Thread-2;|Provisioned slave testaws failed to launch
          java.lang.NullPointerException
          at java.util.TreeMap.put(TreeMap.java:541)
          at java.util.TreeSet.add(TreeSet.java:238)
          at hudson.model.Node.getAssignedLabels(Node.java:240)
          at hudson.model.Slave.<init>(Slave.java:154)
          at hudson.model.Slave.<init>(Slave.java:132)
          at jenkins.plugins.jclouds.compute.JCloudsSlave.<init>(JCloudsSlave.java:47)
          at jenkins.plugins.jclouds.compute.JCloudsSlave.<init>(JCloudsSlave.java:70)
          at jenkins.plugins.jclouds.compute.JCloudsSlaveTemplate.provisionSlave(JCloudsSlaveTemplate.java:163)
          at jenkins.plugins.jclouds.compute.JCloudsCloud$2.call(JCloudsCloud.java:216)
          at jenkins.plugins.jclouds.compute.JCloudsCloud$2.call(JCloudsCloud.java:213)
          at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:662)

          #]
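          The top frames of this trace (TreeSet.add called from Node.getAssignedLabels) are consistent with a null element being added to a sorted set that already holds at least one entry. The following is a minimal sketch of that mechanism only, not the plug-in's or Jenkins' actual code; the class name NullLabelDemo, the helper tryAdd, and the use of plain strings as labels are all hypothetical:

```java
import java.util.TreeSet;

public class NullLabelDemo {
    // Hypothetical helper: returns true if the label was added without error,
    // false if the sorted set rejected it with a NullPointerException.
    static boolean tryAdd(TreeSet<String> labels, String label) {
        try {
            labels.add(label);
            return true;
        } catch (NullPointerException e) {
            // A TreeSet using natural ordering cannot compare null to existing elements.
            return false;
        }
    }

    public static void main(String[] args) {
        TreeSet<String> labels = new TreeSet<>();
        System.out.println(tryAdd(labels, "testaws")); // prints "true"
        System.out.println(tryAdd(labels, null));      // prints "false"
    }
}
```

          If this is what is happening here, one thing to check would be whether any label-related value for the second slave ends up null; that is an assumption based on the trace, not something confirmed in this thread.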

          Andrew Bayer added a comment -

          Any chance you could attach the jclouds section of your config.xml?

          Nathan Sharp added a comment -

          jclouds section of config.xml, with private key and authentication stuff removed.

          Nathan Sharp added a comment -

          If you give me your amazon user number, I can share the AMI with you.

          Andrew Bayer added a comment -

          Is that the account number?

          Nathan Sharp added a comment -

          I think so? I've never shared an AMI before. The field I have to enter is labeled "AWS Account Number". Thanks for all your help!

          Andrew Bayer added a comment -

          0041-2815-1137, then. =) I'll try to give this a spin later today/early tomorrow - I couldn't see anything obviously wrong in the config, and it's probably not the specific AMI, but it's worth a try.

          Nathan Sharp added a comment -

          Thanks, you should have permission to access the AMI now.

          I just tried it again to make sure I wasn't crazy. This time I don't get any exceptions. According to the log it starts up 3 slaves on the cloud, but they don't appear in the GUI, and the tied job is still hanging. How is it supposed to differentiate different instances of the same "testaws" node?

          Again, thanks so much for your help.

          Fritz Elfert added a comment -

          Out of date.
          If the problem persists with the upcoming v2.9, please open a new issue.

            abayer Andrew Bayer
            nsharp Nathan Sharp