INFRA-2904 (project: Infrastructure)

Configure infra.jenkins.io to provision ec2 instances for arm64 Linux + Windows amd64

      Description

      Configure infra.jenkins.io to provision EC2 instances for arm64 Linux and amd64 Windows builds.

      Instances need to be as close as possible to the Azure East US 2 region.

      Todo list:

      • Create an AWS user "jenkins-infra-ci" with an IAM profile "jenkins-ec2-agents", to be used by infra.ci for the EC2 instances - https://plugins.jenkins.io/ec2/#user-content-iam-setup
      • Ensure that the ec2 plugin is installed on infra.ci
      • Ensure that the credentials of the user "jenkins-infra" are loaded into SOPS + infra.ci
      • Create a key pair for EC2 agents named "jenkins-infra-agents"
      • Ensure that the SSH private key associated with the key pair "jenkins-infra" is loaded into SOPS + infra.ci
      • Create an EC2 security group that restricts ingress to traffic from infra.ci and egress to HTTP/HTTPS/DNS/SSH
      • Apply the configuration
      • Check the instance sizing (no Spot instances, as infra.ci is not exposed publicly)
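
For the IAM item in the list above, the policy attached to the agent user could be assembled roughly as follows. This is a minimal sketch, not the policy actually applied for this issue: the action list is an assumed subset of what the EC2 plugin's IAM setup page (linked above) asks for, with the Spot-related actions left out because no Spot instances are planned, and the `build_agent_policy` helper and the `JenkinsEc2Agents` statement ID are hypothetical names.

```python
import json

# Assumed subset of the actions the Jenkins EC2 plugin needs for on-demand
# agents. Verify against the plugin's IAM setup page before relying on it.
EC2_AGENT_ACTIONS = [
    "ec2:DescribeInstances",
    "ec2:DescribeImages",
    "ec2:DescribeKeyPairs",
    "ec2:DescribeRegions",
    "ec2:DescribeAvailabilityZones",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeSubnets",
    "ec2:RunInstances",
    "ec2:StartInstances",
    "ec2:StopInstances",
    "ec2:TerminateInstances",
    "ec2:CreateTags",
    "ec2:DeleteTags",
    "ec2:GetConsoleOutput",
    "ec2:GetPasswordData",  # used to retrieve the Windows administrator password
    "iam:ListInstanceProfilesForRole",
    "iam:PassRole",
]


def build_agent_policy(actions=EC2_AGENT_ACTIONS):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "JenkinsEc2Agents",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_agent_policy(), indent=2))
```

The resulting JSON can be pasted into whatever tool manages the IAM user; narrowing `Resource` from `*` is left out of the sketch.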

            Activity

            dduportal Damien Duportal added a comment -

            Proposal for the configurations:

            • The ARM configuration should use the same AMI that ci.jenkins.io currently uses (for the scope of this issue): Ubuntu 18.04. Proposed instance type: t4g.medium (2 vCPUs, 4 GiB) - https://aws.amazon.com/ec2/instance-types/t4/
            • The Windows configuration should use the same AMI that ci.jenkins.io currently uses (for the scope of this issue): Windows Server 2019. Proposed instance type: t3.medium (2 vCPUs, 4 GiB) - https://aws.amazon.com/ec2/instance-types/t3/

            dduportal Damien Duportal added a comment -

            • IAM user and role created in https://github.com/jenkins-infra/terraform-states/commit/ed94e04ffe2d433c5b5df2f3aab388e166304222 (private and restricted repository)
            • SSH key created, and the private key has been added, along with the AWS API credentials for the IAM user, to the SOPS secrets for jenkins-infra: https://github.com/jenkins-infra/charts-secrets/commit/31d56c411c312ed494ada0fd65623a68c85b5b35 (private and restricted repository)
            • PR to add the credentials on infra.ci: https://github.com/jenkins-infra/charts/pull/1067

            dduportal Damien Duportal added a comment - - edited

            EC2 plugin is installed on infra.ci: https://github.com/jenkins-infra/docker-jenkins-weekly/blob/main/plugins.txt#L34

            dduportal Damien Duportal added a comment -

            Key pair created on AWS: https://github.com/jenkins-infra/aws/pull/20

            dduportal Damien Duportal added a comment -

            Alas, the Ed25519 format for SSH keys is not supported by AWS (only RSA), so I had to switch the key pair:

            • SOPS secrets (private key only): https://github.com/jenkins-infra/charts-secrets/commit/08fffef516acc1fd833a5f9632408743aadb1cb6
            • Public key in AWS: https://github.com/jenkins-infra/aws/commit/46e70211b00a4670ec0aa934dc4942662d6b449e

            dduportal Damien Duportal added a comment -

            And the private key used by the ec2 plugin MUST be in PEM format: https://github.com/jenkins-infra/charts-secrets/commit/8c0c3eeefcb89e6a65f080a022d352a111c1e261
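
The two key-format constraints noted in the comments above (RSA only, PEM encoding) can be checked before a key is stored in the secrets. The following is a rough sketch that only inspects the first line of the key file; `classify_private_key` is a hypothetical helper, and it does not parse or validate the key material itself.

```python
# Header lines written by OpenSSH tooling. A traditional PEM RSA key can be
# generated with `ssh-keygen -t rsa -b 4096 -m PEM`; without `-m PEM`, recent
# OpenSSH versions write the newer OpenSSH container format instead.
PEM_RSA_HEADER = "-----BEGIN RSA PRIVATE KEY-----"
OPENSSH_HEADER = "-----BEGIN OPENSSH PRIVATE KEY-----"


def classify_private_key(text: str) -> str:
    """Classify a private key by its first line: 'pem-rsa', 'openssh' or 'unknown'."""
    stripped = text.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    if first_line == PEM_RSA_HEADER:
        return "pem-rsa"
    if first_line == OPENSSH_HEADER:
        # The OpenSSH container can hold RSA or Ed25519 keys; the header
        # alone does not say which.
        return "openssh"
    return "unknown"
```

Under the constraints described in this issue, only a key classified as `pem-rsa` would be expected to work with the ec2 plugin.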

            dduportal Damien Duportal added a comment -

            Required security group: https://github.com/jenkins-infra/aws/commit/f410e2aa721dc6b43d71267513effab299edf954
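
The rule set described in the todo list (ingress only from infra.ci, egress limited to HTTP/HTTPS/DNS/SSH) might look like the sketch below. It is not the content of the linked commit: the controller address `203.0.113.10/32` is a documentation placeholder, the ingress port (SSH, 22) is an assumption based on the agents being launched over SSH, and the helper names are hypothetical. The dictionaries follow the `IpPermissions` shape used by the AWS EC2 API.

```python
import ipaddress

# Hypothetical placeholder for the public address of infra.ci.
CONTROLLER_CIDR = "203.0.113.10/32"


def rule(protocol, port, cidr, description):
    """Build one single-port rule in the EC2 IpPermissions shape."""
    ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
    return {
        "IpProtocol": protocol,
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr, "Description": description}],
    }


def agent_ingress(controller_cidr=CONTROLLER_CIDR):
    """Ingress: only SSH from the controller (assumed port)."""
    return [rule("tcp", 22, controller_cidr, "SSH from infra.ci")]


def agent_egress():
    """Egress: HTTP, HTTPS, DNS (TCP and UDP) and SSH to anywhere."""
    anywhere = "0.0.0.0/0"
    return [
        rule("tcp", 80, anywhere, "HTTP"),
        rule("tcp", 443, anywhere, "HTTPS"),
        rule("tcp", 53, anywhere, "DNS over TCP"),
        rule("udp", 53, anywhere, "DNS over UDP"),
        rule("tcp", 22, anywhere, "SSH"),
    ]
```

If RDP access to the Windows agents is needed for debugging, an extra ingress rule for port 3389 would have to be added; that is outside this sketch.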

            dduportal Damien Duportal added a comment -

            PR to add configuration on infra.ci: https://github.com/jenkins-infra/charts/pull/1080/files

            dduportal Damien Duportal added a comment - - edited

            Deployed to production:

            • ARM is working well
            • Windows is not able to start the agent process:
              • Machines are started and provisioned
              • Machines are reachable with RDP, Docker is up on the machines, Java is installed
              • Agent startup fails with an SSH authentication error. We are using version 1.21 (not the latest) of the ssh-agent plugin.
              • The authentication error can be reproduced with a local SSH client: it is NOT a Jenkins issue, but an issue with the Windows Packer image.

            => As the Windows agent is working on ci.jenkins.io, the diff will tell us what I changed that broke the behavior. For now, it looks like the startup timeout is the culprit.

            dduportal Damien Duportal added a comment -

            Created https://issues.jenkins.io/browse/INFRA-2952 to log all improvement tasks outside the scope of this issue.

            dduportal Damien Duportal added a comment -

            The startup timeout was the issue.
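
To illustrate why a startup timeout can show up as a failed agent launch, here is a minimal sketch of deadline-based polling. It is not the ec2 plugin's implementation: `wait_until` and its parameters are hypothetical, and the probe stands in for whatever readiness check is used (for example, an SSH login attempt).

```python
import time


def wait_until(probe, timeout_s, interval_s=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or timeout_s seconds have elapsed.

    Returns True if the probe succeeded, False if the deadline passed first.
    clock and sleep are injectable so the loop can be exercised without
    actually waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

If an instance needs longer than `timeout_s` before its SSH server accepts logins, the loop gives up even though the machine itself is healthy, which is consistent with the Windows machines above being reachable over RDP while the agent launch failed.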


              People

              Assignee:
              dduportal Damien Duportal
              Reporter:
              dduportal Damien Duportal