Jenkins / JENKINS-53683

Parallel pod-provisioning fails for distinct pod templates


    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major
    • Component: kubernetes-plugin
    • None
    • Openshift Master: 3.9.0
      Kubernetes Master: 1.9.1
      Jenkins: 2.138.1
      Kubernetes Plugin: 1.12.6

      When we start two Jenkins jobs simultaneously that use different pod templates, pod provisioning fails.
      OpenShift tries to create the pods, but they are either terminated immediately or end up in an error state with all containers running except jnlp (ConnectionRefusalException: Unknown client name).
      This repeats until the container cap is reached or until one of the two queued builds is aborted manually.
      Everything works fine if the two jobs use the same pod template; we observe this behavior only with different pod templates.
      The attached logs demonstrate the behavior.
      The two pods provisioned first are:

      • docker-4h6km (kubernetes pod template name "docker")
      • jenkinsslave-rzk3k-wr4wz (kubernetes pod template name "jenkinsslave")

      The former is configured in the global configuration of the Jenkins UI; the latter is configured directly in the pipeline script.
      The pod templates have different names and labels, but both contain a container named "docker".
      Strangely, the Jenkins log shows a KubernetesSlave _terminate on the docker pod directly after provisioning:

      INFO: Waiting for Pod to be scheduled (1/100): docker-4h6km
      Sep 20, 2018 12:14:28 PM hudson.slaves.NodeProvisioner$2 run
      INFO: Kubernetes Pod Template provisioning successfully completed. We have now 3 computer(s)
      Sep 20, 2018 12:14:28 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
      INFO: Terminating Kubernetes instance for agent docker-4h6km
      Sep 20, 2018 12:14:28 PM okhttp3.internal.platform.Platform log
      

      Could the subsequent errors be caused by this early termination of the pod?
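
One way to check this hypothesis against jenkins_log.txt is to look for agents that are terminated within a few log messages of their "Waiting for Pod to be scheduled" line. The following is a minimal Python sketch, not part of Jenkins or the Kubernetes plugin. The function name find_early_terminations and the max_messages threshold are hypothetical, and the script assumes the log uses the two-line java.util.logging layout seen in the excerpt above (a timestamp/class/method line followed by a "LEVEL: message" line).

```python
import re

# Header line of a log record, e.g.
# "Sep 20, 2018 12:14:28 PM hudson.slaves.NodeProvisioner$2 run"
HEADER = re.compile(r"^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) \S+ \S+$")
# Message line of a log record, e.g. "INFO: Terminating Kubernetes instance ..."
MESSAGE = re.compile(r"^(?:SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST): (.*)$")
WAITING = re.compile(r"Waiting for Pod to be scheduled \(\d+/\d+\): (\S+)")
TERMINATING = re.compile(r"Terminating Kubernetes instance for agent (\S+)")


def find_early_terminations(lines, max_messages=5):
    """Return (agent, timestamp) pairs for agents whose termination is logged
    at most max_messages log messages after their last scheduling wait."""
    timestamp = None      # timestamp of the most recent header line, if any
    message_count = 0     # running count of "LEVEL: message" lines
    last_wait = {}        # agent name -> message_count at its last wait line
    hits = []
    for raw in lines:
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            timestamp = header.group(1)
            continue
        message = MESSAGE.match(line)
        if not message:
            continue      # stack traces, blank lines, truncated records
        message_count += 1
        text = message.group(1)
        waiting = WAITING.search(text)
        if waiting:
            last_wait[waiting.group(1)] = message_count
            continue
        terminating = TERMINATING.search(text)
        if terminating:
            agent = terminating.group(1)
            if agent in last_wait and message_count - last_wait[agent] <= max_messages:
                hits.append((agent, timestamp))
    return hits


# The excerpt quoted in this report, without the truncated last record.
SAMPLE = """\
INFO: Waiting for Pod to be scheduled (1/100): docker-4h6km
Sep 20, 2018 12:14:28 PM hudson.slaves.NodeProvisioner$2 run
INFO: Kubernetes Pod Template provisioning successfully completed. We have now 3 computer(s)
Sep 20, 2018 12:14:28 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
INFO: Terminating Kubernetes instance for agent docker-4h6km
"""

if __name__ == "__main__":
    for agent, when in find_early_terminations(SAMPLE.splitlines()):
        print(f"{agent} terminated at {when}")
```

Counting messages rather than comparing timestamps is a deliberate simplification: the excerpt starts mid-record, so the first wait line has no timestamp to compare against. On a busy master with interleaved records the threshold would likely need tuning.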

      Attached files:

      • jenkins_log.txt - log of the Jenkins master
      • failure_jnlp_agent.txt - log of the jnlp agent in one of the failed pods
      • kubernetes_pod_template_docker.png - configuration of the docker pod template in the Jenkins UI
      • kubernetes_pod_template_docker_volumes.png - volume configuration of the docker pod template in the Jenkins UI

      The cluster has two Kubernetes masters and four nodes.

      Attachments:

        1. failure_jnlp_agent.txt (5 kB, Fabian Braun)
        2. jenkins_log.txt (407 kB, Fabian Braun)
        3. kubernetes_cloud_config.PNG (23 kB, Fabian Braun)
        4. kubernetes_plugin_log.txt (94 kB, Fabian Braun)
        5. kubernetes_pod_template_docker_volumes.PNG (8 kB, Fabian Braun)
        6. kubernetes_pod_template_docker.PNG (24 kB, Fabian Braun)

      Assignee: Unassigned
      Reporter: Fabian Braun (fabian_braun)