• Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: kubernetes-plugin
    • Labels: None
    • Environment: Jenkins 2.414.1
      kubernetes plugin 4029.v5712230ccb_f8

      After the latest upgrade of Jenkins itself (from the previous LTS to the latest LTS) and of the kubernetes plugin, I noticed that it now creates 2 pods per job run.

      It started happening only recently, but I cannot tell exactly which component upgrade caused it.

      I create a pod using a single `podTemplate` function.
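      For illustration, here is a minimal sketch of that kind of usage (the container name, image and shell step are placeholders, not my actual values):

      // Hypothetical, simplified pipeline; the real template is larger and redacted.
      podTemplate(containers: [
          containerTemplate(
              name: 'build',
              image: 'maven:3.9-eclipse-temurin-17',
              command: 'sleep',
              args: 'infinity'
          )
      ]) {
          node(POD_LABEL) {
              container('build') {
                  // A single agent pod is expected to serve this entire block.
                  sh 'mvn -version'
              }
          }
      }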

      This is how the Jenkins logs look in the UI:

      00:00:07.660  Created Pod: kubernetes adm-prod-jenkins-agents/jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-7ws76
      00:00:21.526  Still waiting to schedule task
      00:00:21.528  Waiting for next available executor on ‘jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-7ws76’
      00:00:38.214  Agent jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-7ws76 is provisioned from template jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1
      00:00:38.237  ---
      00:00:38.237  apiVersion: "v1"
      00:00:38.237  kind: "Pod"
      00:00:38.237  metadata:
      00:00:38.237  <redacted>
      00:00:38.238  
      00:00:38.347  Created Pod: kubernetes adm-prod-jenkins-agents/jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-dr2nw
      00:00:38.679  Running on jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-7ws76 in /home/jenkins/agent/workspace/apps--certly-client_v2
      

      Note that the prefix is the same and only the last 5 characters differ.

      And this is how the stdout of the Jenkins server looks:

      2023-09-05 03:43:52.304+0000 [id=416]    INFO    hudson.slaves.NodeProvisioner#update: jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-7ws76 provisioning successfully completed. We have now 2 computer(s)
      2023-09-05 03:43:52.383+0000 [id=418]    INFO    o.c.j.p.k.KubernetesLauncher#launch: Created Pod: kubernetes adm-prod-jenkins-agents/jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-7ws76
      2023-09-05 03:43:57.188+0000 [id=974]    INFO    h.TcpSlaveAgentListener$ConnectionHandler#run: Accepted JNLP4-connect connection #7 from /10.51.1.43:58702
      2023-09-05 03:44:22.869+0000 [id=418]    INFO    o.c.j.p.k.KubernetesLauncher#launch: Pod is running: kubernetes adm-prod-jenkins-agents/jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-7ws76
      2023-09-05 03:44:23.007+0000 [id=47]    INFO    hudson.slaves.NodeProvisioner#update: jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-dr2nw provisioning successfully completed. We have now 3 computer(s)
      2023-09-05 03:44:23.070+0000 [id=418]    INFO    o.c.j.p.k.KubernetesLauncher#launch: Created Pod: kubernetes adm-prod-jenkins-agents/jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-dr2nw
      2023-09-05 03:44:27.955+0000 [id=1029]    INFO    h.TcpSlaveAgentListener$ConnectionHandler#run: Accepted JNLP4-connect connection #8 from /10.51.1.46:42336
      ...
      
      2023-09-05 03:47:02.719+0000 [id=976]    INFO    j.s.DefaultJnlpSlaveReceiver#channelClosed: Computer.threadPoolForRemoting [#23] for jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-7ws76 terminated: java.nio.channels.ClosedChannelException 
      2023-09-05 03:50:42.663+0000 [id=1549]    INFO    o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-dr2nw                                                                                          
      2023-09-05 03:50:42.886+0000 [id=1549]    INFO    o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent adm-prod-jenkins-agents/jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-dr2nw                                                               
      2023-09-05 03:50:42.888+0000 [id=1549]    INFO    o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-dr2nw                                                                                                              
      2023-09-05 03:50:42.889+0000 [id=1549]    INFO    j.s.DefaultJnlpSlaveReceiver#channelClosed: Computer.threadPoolForRemoting [#33] for jenkins-slave-d2e6bd35-409a-48e5-8b13-bf255719d5c2-zxfx1-dr2nw terminated: java.nio.channels.ClosedChannelException 
      

      So one of the pods exits immediately after the job has completed, and the other stays there until the job timeout.

          [JENKINS-71967] kubernetes-plugin creates 2 pods per run

          Steven added a comment -

          I can confirm this issue with Jenkins 2.422, occurring after upgrading the plugin from 3995.v227c16b_675ee to 4029.v5712230ccb_f8. Maybe this helps with identifying the issue.


          John added a comment - - edited

          Same with Jenkins 2.401.3: after updating to 4029.v5712230ccb_f8, we have been experiencing pod duplication. I don't necessarily agree that this issue should be considered a minor priority, because it drastically affects cloud costs if you intend to use dynamic node provisioning.


          UPD: I've done testing and can now confirm that the cause of this behavior is version 4007.v633279962016 of the plugin.

          I've performed a proper rollback, including the plugin dependencies, and the first time this behavior occurred was with this plugin configuration:

          • snakeyaml-api: 1.33-95.va_b_a_e3e47b_fa_4
          • kubernetes-client-api: 6.4.1-215.v2ed17097a_8e9
          • kubernetes-credentials: 0.10.0
          • kubernetes: 4007.v633279962016

          The bug is still present in version 4029.v5712230ccb_f8

          The working plugin configuration is as follows:

          • snakeyaml-api: 1.33-95.va_b_a_e3e47b_fa_4
          • kubernetes-client-api: 6.4.1-215.v2ed17097a_8e9
          • kubernetes-credentials: 0.10.0
          • kubernetes: 3995.v227c16b_675ee

          If you've updated the configuration-as-code plugin, it also needs to be downgraded to 1670.v564dc8b_982d0 because of snakeyaml's breaking update, as does any other plugin that depends on snakeyaml-api.
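
          For anyone pinning plugin versions with the Plugin Installation Manager tool (or a baked controller image), the working combination could be expressed roughly as the following hypothetical plugins.txt, with the versions copied from the lists in this comment:

          kubernetes:3995.v227c16b_675ee
          kubernetes-client-api:6.4.1-215.v2ed17097a_8e9
          kubernetes-credentials:0.10.0
          snakeyaml-api:1.33-95.va_b_a_e3e47b_fa_4
          configuration-as-code:1670.v564dc8b_982d0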


          Robert added a comment - - edited

          I can confirm:

          Jenkins: v2.414.1

          Kubernetes plugin: 4054.v2da_8e2794884

          It affects cloud costs: all pipelines run and we get the expected result, but I estimate a price increase of about 50%.


          Gao added a comment -

          I can confirm:

          Jenkins: v2.414.1 or Jenkins 2.387.3

          Kubernetes plugin: 4029.v5712230ccb_f8


          Robert added a comment -

          I have another observation:

          We have relatively complex jobs involving many sidecars which load databases etc. All these jobs start twice.

          However, we have one very simple job, and only that job is started once.

          Can anyone point me to something I could look at so we can sharpen our analysis?


          Petri Airaksinen added a comment -

          My observation is that if a Kubernetes node already exists for the agent pod, the additional pod is not created. If a node is scaled up during pod creation, an additional pod is created, and then a node for it as well.

          Robert added a comment - - edited

          I can confirm the above:

          1. When the autoscaler creates a new node,
          2. a 2nd pod is created at the moment the health checks of the first pod pass,
          3. and the 2nd pod is destroyed at an unclear moment, but often before the first pod ends.

          If

          1. there is already a node on which to schedule the pod,
          2. only one pod is created on the existing node,
          3. and no additional pods are created.


            Assignee: Unassigned
            Reporter: Ivan Kurnosov (zerkms)
            Votes: 9
            Watchers: 10
