• Bug
    • Resolution: Unresolved
    • Critical
    • kubernetes-plugin
    • None
    • - 3-node Rancher 2.6.2 cluster on openSUSE Leap 15.3, kernel 5.3.18-59.34-default, for workloads
      - single-node k3s cluster running Rancher 2.6.2 for management
      - Jenkins 2.303.3, Java 11
      - Kubernetes plugin 1.30.11

      The workload runs repeatedly without issues for about six hours, then fails with pods in state "not ready" and the following logs in Jenkins:

      java.net.ProtocolException: Expected HTTP 101 response but was '502 Bad Gateway'
      	at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229)
      	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196)
      	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
      	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base/java.lang.Thread.run(Thread.java:829)
      ERROR: Process exited immediately after creation. See output below
      Executing sh script inside container jnks-fio-inside-mount-container of pod patamat-k8s-nfsv4-1-fio-inside-mount-opensuse-579-dnlbw-f-hrf8t

      Process exited immediately after creation. Check logs above for more details.

      and the following logs in Rancher for the jnlp container:

      Warning: SECRET is defined twice in command-line arguments and the environment variable
      Warning: AGENT_NAME is defined twice in command-line arguments and the environment variable
      Dec 01, 2021 2:52:05 PM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up agent: server-k8s-nfsv4-1-fio-inside-mount-opensuse-506-kdbmg-7h6-g27np
      Dec 01, 2021 2:52:05 PM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Dec 01, 2021 2:52:05 PM hudson.remoting.Engine startEngine
      INFO: Using Remoting version: 4.11
      Dec 01, 2021 2:52:05 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
      INFO: Using /home/jenkins/agent/remoting as a remoting work directory
      Dec 01, 2021 2:52:05 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
      INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
      Dec 01, 2021 2:52:05 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: WebSocket connection open
      Dec 01, 2021 2:52:05 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Dec 01, 2021 2:56:16 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave$SlaveDisconnector call
      INFO: Disabled agent engine reconnects.
      Dec 01, 2021 2:56:20 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Write side closed
      Dec 01, 2021 2:56:20 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Read side closed
      Dec 01, 2021 2:56:20 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      Dec 01, 2021 2:56:20 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Read side closed
      

          [JENKINS-67269] Kubernetes workload fails after few hours

          bga added a comment -

          I have also tried multiple agent versions, as well as Rancher 2.6.0, with the same result.


          bga added a comment -
          Running the following pod config:
          pipeline {
              agent {
                  kubernetes {
                      yaml '''
          apiVersion: v1
          kind: Pod
          metadata:
           name: jnks-fio-inside-mount-pod
           labels:
             app: jnks-fio-inside-mount
          spec:
           containers:
            - name: jnlp
              image: 'jenkins/inbound-agent:4.11-1-jdk11'
              args: ['\$(JENKINS_SECRET)', '\$(JENKINS_NAME)']
            - name: jnks-fio-inside-mount-container
              image: myimage
              command:
               - sleep
              args:
               - infinity
              securityContext:
               privileged: true
          '''
                      // Can also wrap individual steps:
                      // container('shell') {
                      //     sh 'hostname'
                      // }
                      defaultContainer 'jnks-fio-inside-mount-container'
                  }
              }
              stages {
                  stage ('Mount share') {
                      steps{
                          sh 'mount -vvv -o vers=4.1 -o nconnect=16 server:/fio-jnks-opensuse-inside-1 /mnt'
                      }
                  }
                  stage ('Run FIO') {
                      steps{
                          sh 'fio --verify=crc32c --buffer_compress_percentage=0 --buffer_compress_chunk=0 --size=10G --directory=/mnt --name=fio --ioengine=libaio  --fallocate=none --iodepth=2 --rw=write --bs=512k --direct=1 --numjobs=8 --nrfiles=8 --runtime=600 --group_reporting'
                      }
                  }
                  stage('Delete files') {
                      steps {
                          sh 'rm -rf /mnt/fio*'
                      }
                  }
                      
              }
              post {
                  // Send the build result to slack channel
                  unsuccessful {
                    slackSend botUser: true, channel: '#general', color: 'danger', message: "${env.JOB_NAME} Unsuccessful build", teamDomain: 'xxx', tokenCredentialId: 'slack-bot'
                  }
              }
              triggers { upstream(upstreamProjects: "${env.JOB_NAME}", threshold: hudson.model.Result.SUCCESS) }
          }
          


          bga added a comment - - edited

          UPDATE:
          It seems that when a pod is created but stays in status "not ready", Jenkins creates new pods, and those do run. The pod that causes the Jenkins failure is not the one with the "not ready" status; they have different names. So it seems that Jenkins is failing with

          java.net.ProtocolException: Expected HTTP 101 response but was '500 Internal Server Error'
          

          while creating the pod.
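
          Since two different statuses (502 and 500) have now shown up in place of the expected 101, it may help to tally them across a saved build log. The following is only a rough sketch in Python, not part of the plugin or of Jenkins. The function name tally_handshake_failures is made up for illustration, and the pattern assumes the exact okhttp message wording quoted in this issue.

```python
import re
from collections import Counter

# Matches the okhttp message quoted in this issue, for example:
#   Expected HTTP 101 response but was '502 Bad Gateway'
# Assumption: the wording is stable for the okhttp version in use.
HANDSHAKE_FAILURE = re.compile(
    r"Expected HTTP 101 response but was '(\d{3}) ([^']*)'"
)


def tally_handshake_failures(log_text):
    """Count failed WebSocket upgrades, keyed by the (status, reason) received instead of 101."""
    counts = Counter()
    for match in HANDSHAKE_FAILURE.finditer(log_text):
        status, reason = match.groups()
        counts[(int(status), reason)] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the logs in this issue.
    sample = "\n".join([
        "java.net.ProtocolException: Expected HTTP 101 response but was '502 Bad Gateway'",
        "\tat okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229)",
        "java.net.ProtocolException: Expected HTTP 101 response but was '500 Internal Server Error'",
    ])
    for (status, reason), count in sorted(tally_handshake_failures(sample).items()):
        print(f"{status} {reason}: {count}")
```

          Running it over the console output of several failed builds would show whether the failures are mostly 502 or mostly 500. That alone does not identify which component returned the error.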


            Assignee: Unassigned
            Reporter: bga (bganicky)