
Can not register an EC2 instance as a node agent

    • Bug
    • Resolution: Fixed
    • Critical
    • ec2-plugin
    • None

      After the latest merges, when an instance is created or located, it is not registered as a node.

          [JENKINS-46869] Can not register an EC2 instance as a node agent

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Alicia Doblas
          Path: src/main/java/hudson/plugins/ec2/EC2Cloud.java
          http://jenkins-ci.org/commit/ec2-plugin/447730b2b5a7748cc3414b0e6e3123e12b02d4e3
          Log:
          JENKINS-46869 Can not register instance as an agent (#238)

          marc young added a comment - edited

          Ran into this today.

          1.37 shows up as an available update in the plugin manager.

          It spins up agents correctly, but they never show up as build agents and never get terminated. Log messages look fine, but the build queue backs up indefinitely.


          marc young added a comment -
          $ awslogs get ops-ci-master --start='2h' | grep 'InstanceId: i-00d8614bd54e4544d,ImageId: ami-bff' | grep 'Found existing pending or running' | wc -l
          1694
          
          $ awslogs get ops-ci-master --start='2h' | grep 'InstanceId: i-00d8614bd54e4544d,ImageId: ami-bff' | grep Creating\ new\ slave | wc -l
          847
          

          It never actually did anything; it just kept checking, creating, checking, creating.

          Downgrading to 1.36 immediately resolved this.


          marc young added a comment -

          adoblas, that commit did not resolve the issue: it still occurs in the released 1.37 version.


          Alicia Doblas added a comment -

          myoung34 I'll try to look at this in the next couple of days.


          Alicia Doblas added a comment -

          myoung34 After the upgrade, deleting the existing Jenkins EC2 plugin configuration (instance setup and cloud config) and re-creating it seems to fix the problem.

          You can follow the issue at https://issues.jenkins-ci.org/browse/JENKINS-47130


          marc young added a comment -

          adoblas, this is still an issue for me.
          I can't delete the plugin configuration and recreate it; my configuration holds a lot of information (labels, AMI IDs, etc.).
          I'm going to stick with 1.36 until the issue is resolved. It's not worth the effort to recreate my configuration for a broken update.


          Alicia Doblas added a comment -

          myoung34 I completely understand. I'll try to find some time to fix this.


          David Hayes added a comment - edited

          Saw this issue on 1.37 with core 2.73.3 as well. Jenkins would launch a slave correctly and detect its existence, but never create a corresponding node for the created instance.

          Removing the existing cloud configuration and re-entering it resolved the issue, with slaves launching as expected afterwards.

          After re-entering the configuration, it was noted that the new config.xml contained <node>true</node> for each node (previously set to false).
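
A quick way to check a saved configuration for the stale flag described above is to count `<node>` elements whose text is `false`. The following is only a sketch: the class name `NodeFlagCheck`, the method name, and the wrapper elements in the sample XML are hypothetical, and a real Jenkins config.xml will have a different surrounding structure. Only the `<node>` element name comes from the comment above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class NodeFlagCheck {

    // Counts <node> elements whose text content is "false" anywhere in the XML.
    static int countFalseNodeFlags(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList flags = doc.getElementsByTagName("node");
            int count = 0;
            for (int i = 0; i < flags.getLength(); i++) {
                if ("false".equals(flags.item(i).getTextContent().trim())) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample; the wrapper element names are made up for illustration.
        String sample =
                "<templates>"
              + "<template><ami>ami-aaa</ami><node>false</node></template>"
              + "<template><ami>ami-bbb</ami><node>true</node></template>"
              + "</templates>";
        System.out.println("templates with node=false: " + countFalseNodeFlags(sample));
    }
}
```

A non-zero count on a configuration saved before the upgrade would be consistent with the behaviour reported in this thread.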


          Francis Upton added a comment -

          I see the problem here. This was caused by the "node" field that was added to the SlaveTemplate (https://github.com/jenkinsci/ec2-plugin/pull/232). The initial value of "node" was set to "true", allowing the template to be provisioned normally. However, any previously serialized SlaveTemplates would (incorrectly) have the value "false", which caused them to be ignored during normal provisioning. The work of the PR did not consider this case.

          I'm going to replace this with a different mechanism which more explicitly marks nodes created by pipeline steps so they can be ignored.
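
The failure mode described above can be reproduced in miniature. The sketch below is an illustration under stated assumptions, not the plugin's code: `Template` and `FixedTemplate` are hypothetical classes, and plain `java.io` serialization with a `transient` field stands in for a saved configuration that lacks the new field. Jenkins persists configuration through XStream, which is assumed here to behave analogously in that field initializers are not run when an object is restored. The `readResolve` hook shown is one common way to supply a default on load; it is not the mechanism the comment above says will be adopted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class DefaultOnLoadDemo {

    // Hypothetical stand-in for a template class; not the plugin's SlaveTemplate.
    static class Template implements Serializable {
        private static final long serialVersionUID = 1L;
        String ami = "ami-example";
        // transient: never written to the stream, like a field missing from old saved data.
        transient boolean node = true;
    }

    // Same shape, but with a nullable flag and a default applied after loading.
    static class FixedTemplate implements Serializable {
        private static final long serialVersionUID = 1L;
        String ami = "ami-example";
        transient Boolean node = Boolean.TRUE;

        // Invoked by Java serialization once the object has been rebuilt.
        private Object readResolve() {
            if (node == null) {
                node = Boolean.TRUE; // the field was absent from the stream
            }
            return this;
        }
    }

    // Serializes a value to bytes and reads it back.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Template plain = roundTrip(new Template());
        FixedTemplate fixed = roundTrip(new FixedTemplate());
        System.out.println("without readResolve: node=" + plain.node); // node=false
        System.out.println("with readResolve: node=" + fixed.node);    // node=true
    }
}
```

The primitive `boolean` cannot distinguish "absent" from "false" after loading, which is why the second class uses a nullable `Boolean`.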


            francisu Francis Upton
            adoblas Alicia Doblas
            Votes: 0
            Watchers: 10