  1. Jenkins
  2. JENKINS-73353

Jenkins controller restart causes ec2 clouds to fail launching new agents

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Component/s: ec2-plugin
    • Labels: None
    • Jenkins version: 2.452.2
      amazonEC2 plugin version: 1688.v8c07e01d657f
      Configuration as Code plugin version: 1810.v9b_c30a_249a_4c

      We have several clouds configured for the amazonEC2 plugin via the Configuration as Code plugin. After every Jenkins restart, we have to open each cloud configuration and save it. If we don't, Jenkins fails to launch any new agents on EC2. See the attached agent log output.

      This appears to be related to, or the same as, https://issues.jenkins.io/browse/JENKINS-56066, although the stack trace is different.

      Example YAML config for one of our clouds:

      - "amazonEC2":
            "name": "spot-small"
            "noDelayProvisioning": true
            "region": "eu-west-1"
            "sshKeysCredentialsId": "ssh-private-key"
            "templates":
            - "ami": "ami-<redacted>"
              "amiType":
                "unixData":
                  "sshPort": "22"
              "associatePublicIp": false
              "connectBySSHProcess": true
              "connectionStrategy": "PRIVATE_IP"
              "customDeviceMapping": "/dev/sda1=:50"
              "deleteRootOnTermination": true
              "description": "spot-small"
              "ebsEncryptRootVolume": "DEFAULT"
              "ebsOptimized": false
              "hostKeyVerificationStrategy": "CHECK_NEW_HARD"
              "iamInstanceProfile": "<redacted>"
              "idleTerminationMinutes": 1
              "instanceCapStr": 10
              "javaPath": "java"
              "labelString": "spot-small"
              "maxTotalUses": 5
              "metadataEndpointEnabled": true
              "metadataHopsLimit": 1
              "metadataSupported": true
              "metadataTokensRequired": false
              "minimumNumberOfInstances": 0
              "minimumNumberOfSpareInstances": 0
              "mode": "EXCLUSIVE"
              "monitoring": true
              "numExecutors": 1
              "remoteAdmin": "<redacted>"
              "securityGroups": "<redacted>"
              "spotConfig":
                "spotMaxBidPrice": "0.020"
                "useBidPrice": true
              "stopOnTerminate": true
              "subnetId": "<redacted>,<redacted>,<redacted>"
              "t2Unlimited": false
              "tags":
              - "name": "Name"
                "value": "spot-small"
              "tenancy": "Default"
              "type": "T3Small"
              "useEphemeralDevices": false
            "useInstanceProfileForCredentials": false

      Jenkins is running in AWS ECS Fargate using image jenkins/jenkins:lts-jdk17 with AWS EFS for persisted storage.

      CPU: 2 vCPU
      Memory: 4GB

      Note: on startup, the Jenkins controller logs contain stack traces with this pattern:

      WARNING jenkins.model.Nodes#load: could not load /var/jenkins_home/nodes/<redacted> (i-<redacted>)
          java.nio.file.NoSuchFileException: /var/jenkins_home/nodes/<redacted> (i-<redacted>)/config.xml
          at java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
          at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
          at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
          at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(Unknown Source)
          at java.base/java.nio.file.Files.newByteChannel(Unknown Source)
          at java.base/java.nio.file.Files.newByteChannel(Unknown Source)
          at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(Unknown Source)
          at java.base/java.nio.file.Files.newInputStream(Unknown Source)
          at hudson.XmlFile.read(XmlFile.java:164)
          at jenkins.model.Nodes.load(Nodes.java:393)
          at jenkins.model.Nodes.load(Nodes.java:340)
          at jenkins.model.Jenkins$12.run(Jenkins.java:3511)
          at org.jvnet.hudson.reactor.TaskGraphBuilder$TaskImpl.run(TaskGraphBuilder.java:177)
          at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:305)
          at jenkins.model.Jenkins$5.runTask(Jenkins.java:1175)
          at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:221)
          at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:120)
          at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.base/java.lang.Thread.run(Unknown Source)
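
The warning above says that a directory exists under /var/jenkins_home/nodes but contains no config.xml. A minimal Python sketch for listing such directories on the persisted volume follows. It assumes the volume can be read from wherever the script runs. The function name find_node_dirs_missing_config is hypothetical and is not part of Jenkins or the EC2 plugin. Whether these leftover directories are connected to the agent launch failure is not established here; the script only helps check for them.

```python
from pathlib import Path


def find_node_dirs_missing_config(nodes_dir):
    """Return the sorted names of subdirectories of nodes_dir without a config.xml.

    Hypothetical helper for inspection only; it does not modify anything.
    """
    root = Path(nodes_dir)
    if not root.is_dir():
        # Nothing to scan if the path is absent or is not a directory.
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and not (entry / "config.xml").is_file()
    )
```

Calling find_node_dirs_missing_config("/var/jenkins_home/nodes") on the controller's volume should return the directory names that would produce the NoSuchFileException shown in the log.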

            thoulen FABRIZIO MANFREDI
            tb0283970 Thomas
            Votes: 2
            Watchers: 2