
EC2 Plugin: Jenkins doesn't wait for cloud-init to finish

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • ec2-plugin
    • None
    • CloudBees Jenkins Enterprise 2.32.3.2-rolling,
      Amazon EC2 plugin 1.36

      When Jenkins creates a new EC2 agent, it logs in over SSH and runs slave.jar before cloud-init has finished running the commands in User Data. Builds can fail if they depend on setup that cloud-init has not yet completed. Jenkins should wait until cloud-init has finished before allowing the agent to be used for a build.

          [JENKINS-43771] EC2 Plugin: Jenkins doesn't wait for cloud-init to finish

          Joseph Maxwell added a comment -

          This may be what I am facing too. We have a deploy node that is assigned an Elastic IP in its init script. When the build pipeline sends a job to the deploy node, we get errors like:

          hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel

          AND

          java.lang.InterruptedException
          at java.lang.Object.wait(Native Method)
          at hudson.remoting.Request.call(Request.java:147)
          at hudson.remoting.Channel.call(Channel.java:829)
          at hudson.EnvVars.getRemote(EnvVars.java:405)
          at hudson.model.Computer.getEnvironment(Computer.java:1147)
          at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask$PlaceholderExecutable.run(ExecutorStepExecution.java:528)
          at hudson.model.ResourceController.execute(ResourceController.java:98)
          at hudson.model.Executor.run(Executor.java:405)
          Finished: FAILURE

          The only fix we have found is to restart the Jenkins service, which somehow makes the deploy node work again.

          Alan added a comment -

          I think this is a duplicate of my issue: https://issues.jenkins-ci.org/browse/JENKINS-43064

          Looking at the code, the plugin only checks for the running state, not the status, which is a different call and the one that signifies whether the user data has fully executed.

          Currently our only resolution is to use a golden image with everything required baked in, which is not ideal.


          Mykola Marzhan added a comment -

          Simple workarounds:

          1) Move the scripts from the "User Data" section to the "Init script" section.

          2) Keep the scripts in "User Data" and add the following script to the "Init script" section:

          until sudo grep "Cloud-init .* finished" /var/log/cloud-init-output.log; do
              echo wait for cloud-init
              sleep 1
          done
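The polling loop in workaround 2 waits forever if cloud-init never writes its "finished" line. A minimal sketch of the same loop with a deadline follows. The function name `wait_for_cloud_init`, its two arguments, and the 600-second default are illustrative assumptions, not part of the plugin or of cloud-init. The sketch also assumes the log file is readable by the current user; prefix `grep` with `sudo`, as in the original loop, if it is not.

```shell
# Hypothetical helper: poll a cloud-init output log until the
# "Cloud-init ... finished" line appears, or give up after a deadline.
#   $1  log file to watch  (default: /var/log/cloud-init-output.log)
#   $2  deadline in seconds (default: 600, an arbitrary choice)
wait_for_cloud_init() {
    local log="${1:-/var/log/cloud-init-output.log}"
    local limit="${2:-600}"
    local waited=0

    # grep -q suppresses the matched line; errors (e.g. the log not
    # existing yet) are discarded so the loop simply keeps polling.
    until grep -q "Cloud-init .* finished" "$log" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "cloud-init did not finish within ${limit}s" >&2
            return 1
        fi
        echo "wait for cloud-init"
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the "Init script" section this could be called as `wait_for_cloud_init || exit 1`, so that a stuck instance fails its launch instead of hanging.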

          Ben Dean added a comment -

          A simpler version of the second part of Mykola's workaround is to put this in your init script:

          sudo cloud-init status --wait
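`cloud-init status --wait` blocks until cloud-init is done, so it can also block indefinitely if cloud-init hangs. One way to bound it is the GNU coreutils `timeout` command, sketched below. The helper name `run_with_deadline` is an illustrative assumption, and the sketch assumes `timeout` is installed on the agent image.

```shell
# Hypothetical helper: run a command, but stop waiting after a deadline.
#   $1     deadline in seconds
#   $2...  command to run and its arguments
run_with_deadline() {
    local limit="$1"
    shift

    timeout "$limit" "$@"
    local rc=$?

    # GNU timeout exits with status 124 when the deadline was hit.
    if [ "$rc" -eq 124 ]; then
        echo "gave up after ${limit}s waiting for: $*" >&2
    fi
    # Any other status is the wrapped command's own exit status.
    return "$rc"
}
```

In the init script this could be used as `run_with_deadline 600 sudo cloud-init status --wait || exit 1`, where 600 seconds is an arbitrary limit to adjust for your User Data.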
          


            francisu Francis Upton
            kwayman Kyle Wayman
            Votes: 7
            Watchers: 11