Jenkins / JENKINS-71420

openstack-cloud-plugin deletes the node when the node goes for a reboot

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • openstack-cloud-plugin
    • None

      openstack-cloud-plugin version: 

      I have a pipeline script that needs to reboot the node before running the tests. I run the tests on both AWS and OpenStack instances. With AWS instances provisioned through the ec2 plugin, there is no problem. With an OpenStack instance, the node is deleted as soon as I reboot it. This has become a blocker for me because the pipeline cannot run on OpenStack.

      From Jenkins log:

      2023-06-09 02:44:47.823+0000 [id=4709]  INFO    h.r.SynchronousCommandTransport$ReaderThread#run: I/O error in channel Ubuntu 22.04 LTS-47
      java.io.EOFException
              at java.base/java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:3034)
              at java.base/java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3529)
              at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:1071)
              at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:493)
              at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
              at hudson.remoting.Command.readFrom(Command.java:142)
              at hudson.remoting.Command.readFrom(Command.java:128)
              at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:61)
      Caused: java.io.IOException: Unexpected termination of the channel
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:75)
      2023-06-09 02:46:04.923+0000 [id=4711]  WARNING j.p.o.c.JCloudsCleanupThread#terminateNodesPendingDeletion: Deleting broken node Ubuntu 22.04 LTS-47 (NovaServer{id=c24c0202-17e6-4aac-9cd9-3227bbba5946, name=Ubuntu 22.04 LTS-47, image={id=015d6c77-f8a3-4542-9910-478997552421, links=[{href=<OpenStack URL>, rel=bookmark}]}, flavor=NovaFlavor{id=<ID>, ephemeral=0, swap=0, rxtx_factor=1.0, links=[GenericLink{href=<OpenStack URL>, rel=bookmark}],
      }, status=ACTIVE, diskconfig=MANUAL, userId=<userID>, created=Fri Jun 09 02:26:05 UTC 2023, updated=Fri Jun 09 02:26:18 UTC 2023, launched at=Fri Jun 09 02:26:18 UTC 2023, tenantId=<tenantID>, hostId=<hostID>, addresses=NovaAddresses{addresses={HQ-LAN=[NovaAddress{address=<id_addr>, type=fixed, version=4, macaddr=<macaddr>,
      }]},
      }, hypervisor host=os-berserker, powerstate=1, instanceName=instance-0001f1b7, vmState=active, metadata={jenkins-boot-image-id=<image_id>, jenkins-cloud-name=openstack, jenkins-template-name=Ubuntu 22.04 LTS, jenkins-instance=<Jenkins-URL>, jenkins-identity=<jenkins-identity>, jenkins-scope=node:Ubuntu 22.04 LTS-47:-1606364798, jenkins-boot-source=Image ubuntu-22.04-20220114}}). Reason: Connection was broken: java.io.EOFException
              at java.base/java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:3034)
              at java.base/java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3529)
              at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:1071)
              at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:493)
              at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
              at hudson.remoting.Command.readFrom(Command.java:142)
              at hudson.remoting.Command.readFrom(Command.java:128)
              at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:61)
      Caused: java.io.IOException: Unexpected termination of the channel
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:75)
      
      2023-06-09 02:46:04.923+0000 [id=4711]  INFO    j.p.o.compute.JCloudsComputer#deleteSlave: Deleting slave Ubuntu 22.04 LTS-47 after executing 1 builds
      2023-06-09 02:46:05.052+0000 [id=4711]  INFO    j.p.o.compute.JCloudsComputer#deleteSlave: Deleted slave Ubuntu 22.04 LTS-47
      2023-06-09 02:46:06.888+0000 [id=4711]  INFO    j.p.o.c.JCloudsCleanupThread#destroyServersOutOfScope: Server Ubuntu 22.04 LTS-47 run out of its scope node:Ubuntu 22.04 LTS-47:-1606364798. Terminating: NovaServer{id=<id>, name=Ubuntu 22.04 LTS-47, image={id=<id>, links=[{href=<openstack-url>, rel=bookmark}]}, flavor=NovaFlavor{id=<id>, ephemeral=0, swap=0, rxtx_factor=1.0, links=[GenericLink{href=<openstack-url>, rel=bookmark}],
      }, status=ACTIVE, diskconfig=MANUAL, userId=<userID>, created=Fri Jun 09 02:26:05 UTC 2023, updated=Fri Jun 09 02:26:18 UTC 2023, launched at=Fri Jun 09 02:26:18 UTC 2023, tenantId=<tenantID>, hostId=<hostID>, addresses=NovaAddresses{addresses={HQ-LAN=[NovaAddress{address=<ip-addr>, type=fixed, version=4, macaddr=<macAddr>,
      }]},
      }, hypervisor host=os-berserker, powerstate=1, instanceName=instance-0001f1b7, vmState=active, metadata={jenkins-boot-image-id=015d6c77-f8a3-4542-9910-478997552421, jenkins-cloud-name=openstack, jenkins-template-name=Ubuntu 22.04 LTS, jenkins-instance=<jenkins-url>, jenkins-identity=<jenkins-identity>, jenkins-scope=node:Ubuntu 22.04 LTS-47:-1606364798, jenkins-boot-source=Image ubuntu-22.04-20220114}}
      

      From config.xml:

                  <name>Ubuntu 22.04 LTS</name>
                  <labelString>ub2204-ostack-zst ub2204</labelString>
                  <slaveOptions>
                    <bootSource class="jenkins.plugins.openstack.compute.slaveopts.BootSource$Image">
                      <name>ubuntu-22.04-20220114</name>
                    </bootSource>
                    <hardwareId>1f1b21d3-42ba-46d0-800b-43904e1d8d15</hardwareId>
                    <userDataId>ostack-ub2204-init</userDataId>
                    <instanceCap>20</instanceCap>
                    <fsRoot>/home/ubuntu</fsRoot>
                    <launcherFactory class="jenkins.plugins.openstack.compute.slaveopts.LauncherFactory$SSH">
                      <credentialsId>ubuntu</credentialsId>
                    </launcherFactory>
                  </slaveOptions>
                </jenkins.plugins.openstack.compute.JCloudsSlaveTemplate>
                <jenkins.plugins.openstack.compute.JCloudsSlaveTemplate>
                  <name>Ubuntu 20.04 LTS</name>
                  <labelString>ub2004-ostack-zst ub2004</labelString>
                  <slaveOptions>
                    <bootSource class="jenkins.plugins.openstack.compute.slaveopts.BootSource$VolumeFromImage"> 


          Oliver Gondža added a comment -

          From the plugin's perspective, a reboot is indistinguishable from VM deletion or a broken connection, so the plugin deletes the node rather than leaving it sitting around for too long.

          I think restarting a VM from a Jenkins job is a bad idea. It is better to adjust the provisioning so that the VM is already in the correct state when it boots.
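To illustrate why these cases look the same to the controller, here is a minimal Python model. It is not the plugin's code: `Cause`, `Node`, `break_channel`, and `cleanup_pass` are hypothetical names invented for this sketch. The only detail taken from the log above is the recorded reason, "Connection was broken".

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Cause(Enum):
    """What really happened to the VM. Hypothetical; the controller never sees this."""
    REBOOT = "reboot"
    VM_DELETED = "vm deleted"
    NETWORK_FAILURE = "network failure"


@dataclass
class Node:
    name: str
    channel_open: bool = True
    pending_delete_reason: Optional[str] = None


def break_channel(node: Node, cause: Cause) -> None:
    """Simulate the agent channel closing.

    The cause is deliberately ignored: whatever happened on the VM,
    the controller only observes that the connection ended.
    """
    node.channel_open = False
    node.pending_delete_reason = "Connection was broken"


def cleanup_pass(nodes: List[Node]) -> List[str]:
    """One pass of an assumed periodic cleanup job.

    Removes every node flagged as broken and returns the deleted names.
    """
    deleted = [n.name for n in nodes if n.pending_delete_reason is not None]
    nodes[:] = [n for n in nodes if n.pending_delete_reason is None]
    return deleted
```

In this model a node whose channel closed because of `Cause.REBOOT` carries exactly the same state as one that closed because of `Cause.VM_DELETED`, so a cleanup pass removes both.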

          Avaneesh added a comment -

          Before starting the tests, I need to install a kernel, and that installation is part of the pipeline. To boot into the new kernel, I need to reboot. The issue does not occur with the ec2 plugin. It would help me if the OpenStack plugin behaved the same way.

          Oliver Gondža added a comment -

          I do not know how AWS is doing this, but speculatively waiting for a VM to come back up is clunky. In the majority of cases, a broken connection is a symptom of a problem that will not recover. Changing the plugin to wait might help you, but it would delay reclaiming resources that other users depend on.

          The most elegant way forward would be to resurrect https://issues.jenkins.io/browse/JENKINS-47742, which would let you build the kernel on one type of agent, then install it on another and continue the build there.

          An alternative might be to:

          • build an image snapshot with the kernel in it (create a machine that is not connected as an agent, install what you need there, and make a snapshot of it)
          • declare an OpenStack template that boots from snapshots with the name you built
          • run the pipeline on 2 agents, using the snapshot-consuming template for the second half.
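The trade-off described in this comment, waiting for a VM to return versus reclaiming resources early, can be sketched as a small Python model. This is an illustration only: the plugin has no such setting as far as this thread shows, and `Outage`, `evaluate`, the grace period, and all the durations are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Outage:
    label: str
    # Seconds until the agent reconnects; None means it never comes back.
    downtime: Optional[float]


def evaluate(outage: Outage, grace_seconds: float) -> Tuple[bool, float]:
    """Return (node_survives, seconds_the_vm_is_held_before_a_decision).

    Assumed policy: the node is kept if the agent reconnects within the
    grace period; otherwise it is deleted once the grace period expires.
    """
    if outage.downtime is not None and outage.downtime <= grace_seconds:
        return True, outage.downtime
    return False, grace_seconds


# Hypothetical scenarios: a 90-second kernel reboot and a VM that never recovers.
reboot = Outage("kernel reboot", downtime=90.0)
dead = Outage("unrecoverable failure", downtime=None)

for grace in (0.0, 300.0):
    for outage in (reboot, dead):
        survives, held = evaluate(outage, grace)
        print(f"grace={grace:>5}s  {outage.label:<22} survives={survives}  held={held}s")
```

With a grace period of 0 both nodes are deleted immediately. With 300 seconds the rebooting node survives, but the unrecoverable one keeps its resources allocated for the full 300 seconds, which is the cost to other users that the comment points out.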

            olivergondza Oliver Gondža
            anianna Avaneesh