Jenkins / JENKINS-67167

In a Kubernetes pod, sh steps inside container() fail sporadically


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Component: kubernetes-plugin
    • Jenkins 2.303.3
      Kubernetes plugin 1.30.6
      Durable Task Plugin: 1.39
      jnlp via jenkins/inbound-agent:4.11-1-alpine-jdk8

      Issue is reproducible using the attached pipeline: jnlpcontainer_tests.groovy

      Description of the test:

      • the pipeline runs inside a k8s pod with multiple containers
        • a jnlp container
        • a build container
      • the pipeline starts 3 parallel branches
        • jnlp branch: runs sh inside container('jnlp'){}
        • build branch: runs sh inside container('build'){} ('build' is the name of the second container in the pod)
        • noContainer branch: runs sh outside any container(){} closure
      • each parallel branch executes a simple sh call
      • in the jnlp and build branches, sh is called inside a container() closure
        • in these 2 branches, sh fails sporadically
      • in the noContainer branch, sh is called outside any container() closure
        • this branch did not fail once in any of the runs I started
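
For readers without access to the attachment, the layout described above could look roughly like the scripted pipeline below. This is a hypothetical sketch, not the attached jnlpcontainer_tests.groovy: the image used for the build container, the echo commands, and the loop structure are assumptions made for illustration. It assumes the Kubernetes plugin's podTemplate and containerTemplate steps and the POD_LABEL variable.

```groovy
// Hypothetical reconstruction of the test layout, NOT the attached pipeline.
// The 'build' image and the echo commands are placeholders.
podTemplate(containers: [
    containerTemplate(name: 'jnlp', image: 'jenkins/inbound-agent:4.11-1-alpine-jdk8'),
    // Keep the second container alive so that sh steps can exec into it.
    containerTemplate(name: 'build', image: 'alpine:3.14', command: 'sleep', args: '99d')
]) {
    node(POD_LABEL) {
        parallel(
            // sh inside container('jnlp'): fails sporadically according to the report
            jnlp: {
                for (int i = 0; i < 100; i++) {
                    container('jnlp') {
                        sh "echo jnlp iteration ${i}"
                    }
                }
            },
            // sh inside container('build'): fails sporadically according to the report
            build: {
                for (int i = 0; i < 100; i++) {
                    container('build') {
                        sh "echo build iteration ${i}"
                    }
                }
            },
            // sh outside any container() closure: no failures reported
            noContainer: {
                for (int i = 0; i < 100; i++) {
                    sh "echo noContainer iteration ${i}"
                }
            }
        )
    }
}
```

All three branches run on the same agent pod, so the only difference between the failing and the non-failing branches is whether sh is wrapped in a container() closure.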

      Mainly 2 exceptions were thrown:

      [2021-11-18T10:49:57.920Z] java.io.EOFException
      [2021-11-18T10:49:57.921Z] 	at okio.RealBufferedSource.require(RealBufferedSource.java:61)
      [2021-11-18T10:49:57.921Z] 	at okio.RealBufferedSource.readByte(RealBufferedSource.java:74)
      [2021-11-18T10:49:57.921Z] 	at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117)
      [2021-11-18T10:49:57.921Z] 	at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
      [2021-11-18T10:49:57.921Z] 	at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
      [2021-11-18T10:49:57.921Z] 	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
      [2021-11-18T10:49:57.921Z] 	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
      [2021-11-18T10:49:57.921Z] 	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
      [2021-11-18T10:49:57.921Z] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      [2021-11-18T10:49:57.921Z] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      [2021-11-18T10:49:57.921Z] 	at java.lang.Thread.run(Thread.java:748)
      [2021-11-18T10:49:57.921Z] ERROR: Process exited immediately after creation. See output below
      [2021-11-18T10:49:57.921Z] Executing sh script inside container jnlp of pod test-multiplecontainers-in-node-5d914e4e-3023-4bf0-845d-2-pcxs5
      [2021-11-18T10:49:57.921Z] 
      Process exited immediately after creation. Check logs above for more details.
      

      and

      [2021-11-18T10:49:58.203Z] java.net.ProtocolException: Expected HTTP 101 response but was '500 Internal Server Error'
      [2021-11-18T10:49:58.205Z] 	at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229)
      [2021-11-18T10:49:58.205Z] 	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196)
      [2021-11-18T10:49:58.205Z] 	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
      [2021-11-18T10:49:58.205Z] 	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
      [2021-11-18T10:49:58.205Z] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      [2021-11-18T10:49:58.205Z] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      [2021-11-18T10:49:58.205Z] 	at java.lang.Thread.run(Thread.java:748)
      io.fabric8.kubernetes.client.KubernetesClientException: error dialing backend: dial tcp 192.168.3.11:10250: connect: connection refused
      
      • NOTE: the test consists of 100 iterations for each branch, all executed in the same agent pod. So if we get a KubernetesClientException with a "connection refused" error and retry on the same container, the sh step eventually works again.
        see:
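
The retry behaviour mentioned in the note could be sketched as a small helper like the one below. It is an illustration only, not part of the reported test and not whatever the dangling "see:" above pointed to. The helper name shWithRetry, the attempt count, and the pause length are hypothetical; the sketch assumes the standard retry and sleep pipeline steps and must be called inside a node block on the agent pod.

```groovy
// Hypothetical helper: retry a flaky sh step in the same container.
// The attempt count and the 5 second pause are arbitrary assumptions.
def shWithRetry(String containerName, String script, int attempts = 3) {
    retry(attempts) {
        try {
            container(containerName) {
                sh script
            }
        } catch (err) {
            // Pause briefly before the next attempt, then rethrow so that
            // retry() registers the failure and runs the body again.
            sleep time: 5, unit: 'SECONDS'
            throw err
        }
    }
}

// Example call, assuming a container named 'build' exists in the pod:
// shWithRetry('build', 'echo hello')
```

A wrapper like this only masks the symptom. It does not explain why the exec websocket to the container is refused in the first place.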

            Assignee: Unassigned
            Reporter: ysmaoui (Yacine)
            Votes: 3
            Watchers: 19
