
Amazon EC2 nodes created and connected but the job is run on the master.

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major
    • ec2-plugin
    • None
    • Jenkins 2.346.2 (m4.4xlarge), Amazon EC2 version 2.0.4, OpenJDK 11.0.16 2022-07-19 LTS, agent (C5Large)

      Currently, we have one Amazon account linked with the plugin, with multiple AMIs for Windows and Linux. We don't see any issue with Windows, but with the Linux agent we occasionally see the problem below. Our nodes are limited to 5 uses.

      Problem:

      • Nodes are created as per the resource requirement.
      • On rare occasions, the Jenkins Coordinator creates a new Jenkins EC2 agent, but the IP address assigned to the “Jenkins Worker Node” is the same as the Jenkins Coordinator's IP address.

      Debugging:

      • Looked at the AMI to verify all the versions are right and installed correctly. Looks good.
      • Added more print statements for debugging, such as printing the hostname and IP address.
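The hostname and IP printing described above could be turned into a check that fails fast when a build lands on the master. The following is only a minimal sketch: the function names and the `controller_ip` parameter are hypothetical, and it assumes the master's private IP is known to the job (for example, passed in as a parameter). It is not part of the ec2-plugin.

```python
import socket


def local_addresses():
    """Return (hostname, set of IPv4 addresses the hostname resolves to)."""
    hostname = socket.gethostname()
    try:
        # gethostbyname_ex returns (canonical name, aliases, address list).
        addresses = set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        # Name resolution can fail on minimal images; report nothing rather than crash.
        addresses = set()
    return hostname, addresses


def runs_on_controller(addresses, controller_ip):
    """True if any local address equals the (assumed known) master address."""
    return controller_ip in addresses


def sanity_check(controller_ip):
    """Print the host identity and abort if this host is the master."""
    hostname, addresses = local_addresses()
    print(f"hostname={hostname} addresses={sorted(addresses)}")
    if runs_on_controller(addresses, controller_ip):
        raise SystemExit("build is executing on the master, not on an EC2 agent")
```

A first build stage might call `sanity_check("<master private IP>")`. On some distributions the hostname resolves to a loopback alias rather than the private IP, so the printed addresses should be checked by hand before relying on the comparison.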

       

      Jenkins Master --> m4.4xlarge 

      Linux Agent --> C5Large

      Note: This happens occasionally, sometimes 3 to 4 times a week. We even checked the load on the Jenkins Master to look for a pattern but found none.

      Please let me know if you need any more information from me.

          [JENKINS-70446] Amazon EC2 nodes created and connected but the job is run on the master.

          Vamshidher Reddy created issue -
          Vamshidher Reddy made changes -
          Description Original: Currently, we have one Amazon account linked with the plugin, with multiple AMIs for Windows and Linux. We don't see any issue with Windows, but with the Linux agent we occasionally see the problem below. Our nodes are limited to 5 uses.

          Problem:
           * Nodes are created as per the resource requirement.
           * When the node comes online, it runs a sanity check stage that verifies all the required versions are installed on the machine.
           * The job fails at that stage, saying the required versions are not installed.

          Debugging:
           * Looked at the AMI to verify all the versions are right and installed correctly. Looks good.
           * We added a Slack notification. As soon as the job failed, we SSHed to the agent directly and checked whether the versions were installed. Everything looked good on the same agent where the Jenkins job was failing.
           * Added more print statements for debugging, such as printing the hostname and IP address.

          Actual Problem found:
           * Jenkins creates the agent correctly, but instead of connecting to the actual agent, the master connects to itself and runs the agent commands there, even though the logs say the job is running on the agent.

          Jenkins Master --> m4.4xlarge 

          Linux Agent --> C5Large

          Note: This happens occasionally, sometimes 3 to 4 times a week. We even checked the load on the Jenkins Master to look for a pattern but found none.

          Please let me know if you need any more information from me.
          New: Currently, we have one Amazon account linked with the plugin, with multiple AMIs for Windows and Linux. We don't see any issue with Windows, but with the Linux agent we occasionally see the problem below. Our nodes are limited to 5 uses.

          Problem:
           * Nodes are created as per the resource requirement.
           * The job fails at the same stage, saying the required versions are not installed.

          Debugging:
           * Looked at the AMI to verify all the versions are right and installed correctly. Looks good.
           * Added more print statements for debugging, such as printing the hostname and IP address.

          Actual Problem found:
           * On rare occasions, the Jenkins Coordinator creates a new Jenkins EC2 agent, but the IP address assigned to the “Jenkins Worker Node” is the same as the Jenkins Coordinator's IP address.

          Jenkins Master --> m4.4xlarge 

          Linux Agent --> C5Large

          Note: This happens occasionally, sometimes 3 to 4 times a week. We even checked the load on the Jenkins Master to look for a pattern but found none.

          Please let me know if you need any more information from me.

          Steve added a comment -

          I'm also seeing this problem. Here is a log line from a broken agent that seems relevant:

          INFO: Connecting to null on port 22, with timeout 10000. 

          Working nodes will say something like this:

          INFO: Connecting to 200.200.200.200 on port 22, with timeout 10000. 
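One possible explanation for the two log lines above, offered as a hypothesis and not confirmed anywhere in this issue: with the standard resolver, a missing (null) host resolves to the loopback address, so a client that connects to a null host ends up talking to the machine it is running on. Java's `InetAddress.getByName(null)` is documented to return a loopback address. The sketch below shows the analogous behaviour with Python's `socket.getaddrinfo`. Whether the plugin's SSH client follows this path is an assumption, and `resolve_targets` and `require_host` are hypothetical helper names.

```python
import ipaddress
import socket


def resolve_targets(host, port=22):
    """Return the IPv4 addresses a TCP client would try for (host, port)."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


def is_loopback_only(addresses):
    """True if there is at least one address and every one is a loopback address."""
    return bool(addresses) and all(ipaddress.ip_address(a).is_loopback for a in addresses)


def require_host(host):
    """Guard: refuse to connect while the instance has no address yet."""
    if not host:
        raise ValueError("agent has no address yet; refusing to connect")
    return host


# A missing host silently resolves to the local machine.
missing_host_targets = resolve_targets(None)
print(missing_host_targets, is_loopback_only(missing_host_targets))
```

A guard such as `require_host` placed before the connection attempt would turn the silent fallback into an explicit error that could be retried once the instance reports an address.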


            Assignee: thoulen FABRIZIO MANFREDI
            Reporter: vjannapureddy Vamshidher Reddy
            Votes: 1
            Watchers: 3
