  1. Jenkins
  2. JENKINS-65945

After Jenkins restart kubernetes agents cannot be provisioned


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: kubernetes-plugin
    • Labels: None
    • Environment:
      Jenkins 2.298
      kubernetes 1.30.0
      kubernetes-client-api 5.4.1

      Short version:

      After a Jenkins instance is restarted, it can no longer provision Kubernetes-based agents; after a second restart, provisioning works again.

      Long version:

      We saw this happen a couple of weeks ago, and now it has happened again. It affects several of the Jenkins instances we run: one of them has the versions listed in the "Environment" field, and some are a few versions older. We also run many other Jenkins instances that don't have this issue.

      We restart the Jenkins instances every Saturday. Before the restart, Kubernetes-based dynamic agents were working correctly. After the Saturday restart they stopped working: builds could not start new agents. Today (Monday) we restarted Jenkins again, and now the agents work perfectly.

      Before the Monday restart I tried recreating the cluster config from scratch, but that didn't fix it.

      Clicking the "Test connection" button in the cloud config reported success.

      The Kubernetes cluster is otherwise healthy; it is happily running pods of other systems.

      Related output of failing builds:

      [Pipeline] Start of Pipeline
      [Pipeline] echo
      Bringing up containers [jnlp:[ttyEnabled:false, image:jenkins-jnlp-slave:linux, alwaysPullImage:true, resourceRequestCpu:0.5, resourceLimitCpu:2, resourceRequestMemory:512Mi, resourceLimitMemory:2Gi]]
      [Pipeline] echo
      This is the overriden podTemplate, to collect slave info to Grafeas
      [Pipeline] podTemplate
      [Pipeline] {
      [Pipeline] withEnv
      [Pipeline] {
      [Pipeline] echo
      This is the overriden node jenkins-istvans-test-5, to collect slave hostname to Grafeas
      [Pipeline] node
      Still waiting to schedule task
      All nodes of label ‘jenkins-istvans-test-5’ are offline
      
      (...it is hanging here, nothing else is happening...)
      

      System logs (org.csanchez.jenkins.plugins.kubernetes = ALL)

      Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      Label "jenkins-istvans-test-5" excess workload: 1, executors: 0
      Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      Template for label "jenkins-istvans-test-5": jenkins-istvans-test-5-p61lh
      Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      In provisioning : []
      Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      Label "jenkins-istvans-test-5" excess workload: 1, executors: 0
      Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      Template for label "jenkins-istvans-test-5": jenkins-istvans-test-5-p61lh
      Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      In provisioning : []
      
      (...nothing else related to the job...)
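
The FINE records above only appear once the `org.csanchez.jenkins.plugins.kubernetes` logger is set to ALL. In Jenkins this is normally done with a log recorder under the System Log page; the sketch below shows the equivalent in plain `java.util.logging`, which is the logging framework Jenkins uses. The class name `KubernetesPluginLogging` and the helper `enableAll` are hypothetical names for illustration, not part of the plugin.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class KubernetesPluginLogging {
    // Logger name taken from the report; loggers are hierarchical by package name.
    static final String LOGGER_NAME = "org.csanchez.jenkins.plugins.kubernetes";

    // Keep a strong reference: the LogManager only holds loggers weakly,
    // so a level set on an unreferenced logger can be lost to garbage collection.
    static final Logger PLUGIN_LOGGER = Logger.getLogger(LOGGER_NAME);

    // Hypothetical helper: set the package logger to ALL so child loggers
    // (for example KubernetesCloud) inherit the level.
    static Logger enableAll() {
        PLUGIN_LOGGER.setLevel(Level.ALL);
        return PLUGIN_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = enableAll();

        // A handler must also accept FINE records; ConsoleHandler defaults to INFO.
        Handler handler = new ConsoleHandler();
        handler.setLevel(Level.ALL);
        logger.addHandler(handler);

        // Child loggers without their own level inherit ALL from the package logger.
        Logger child = Logger.getLogger(LOGGER_NAME + ".KubernetesCloud");
        child.fine("FINE records from child loggers are now published");
    }
}
```

Setting the level on the package-level logger rather than on individual classes captures every class in the plugin, which helps when it is not yet known which component stops responding.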

      I'm not really sure what other info I should give; please let me know if there is anything else I can gather when it happens again.
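
One thing that could be gathered next time is a count of how often the log pattern above repeats. The following is a minimal sketch, assuming the message formats shown in the system log excerpt: it counts provisioning rounds in which a label reports excess workload greater than zero but the "In provisioning" list is empty. Treating that combination as a sign of a stall is an assumption made for this sketch, not behaviour defined by the plugin; a single such round can be normal just before a pod is requested, so only a long run of them is suspicious. `ProvisionLogCheck` and `countStalledRounds` are hypothetical names.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProvisionLogCheck {
    // Message formats copied from the log excerpt in this report.
    private static final Pattern EXCESS =
        Pattern.compile("Label \"([^\"]+)\" excess workload: (\\d+), executors: (\\d+)");
    private static final Pattern IN_PROVISIONING =
        Pattern.compile("In provisioning : \\[(.*)\\]");

    // Counts rounds where demand was reported (excess workload > 0)
    // and the following "In provisioning" list was empty.
    static int countStalledRounds(List<String> lines) {
        int stalled = 0;
        boolean pendingDemand = false;
        for (String raw : lines) {
            String line = raw.trim();
            Matcher excess = EXCESS.matcher(line);
            if (excess.find()) {
                pendingDemand = Integer.parseInt(excess.group(2)) > 0;
                continue;
            }
            Matcher provisioning = IN_PROVISIONING.matcher(line);
            if (provisioning.find()) {
                if (pendingDemand && provisioning.group(1).trim().isEmpty()) {
                    stalled++;
                }
                pendingDemand = false;
            }
        }
        return stalled;
    }

    public static void main(String[] args) {
        // One round taken from the excerpt above; header lines are ignored by the patterns.
        List<String> sample = List.of(
            "Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision",
            "Label \"jenkins-istvans-test-5\" excess workload: 1, executors: 0",
            "Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision",
            "Template for label \"jenkins-istvans-test-5\": jenkins-istvans-test-5-p61lh",
            "Jun 21, 2021 12:29:50 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision",
            "In provisioning : []");
        System.out.println("stalled rounds: " + countStalledRounds(sample));
    }
}
```

Running this over a saved copy of the log recorder output would give a rough number to attach to the report alongside the raw log.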

       

       

            Assignee: Unassigned
            Reporter: Istvan Szekeres (pistahh)
            Votes: 2
            Watchers: 4