Bug
Resolution: Unresolved
Major
Jenkins Version: v2.222.1 LTS (used the image from Docker hub: jenkins/jenkins:2.222.1)
Jenkins Kubernetes plugin version: 1.25.7
Kubernetes version: 1.18.3
We have a K8s setup running the Jenkins master in one of the deployments. We have configured Jenkins to run dynamic agents on Kubernetes.
If a job is running in an agent pod during an upgrade or restart of the Jenkins server, and the Jenkins server pod is restarted once (by deleting the pod; Kubernetes starts it again), the job resumes after the server is fully up and running.
If, for some reason, the Jenkins server pod is restarted two or more times before the Jenkins container is fully up while a pipeline job is running in an agent pod (restarted either manually or by other means, such as Kubernetes killing and starting the pod again), the running job hangs and fails to resume, with the error below:
ERROR: Issue with creating launcher for agent k8s-pod-label-xxxxx-yyyyy. The agent has not been fully initialized yet
This issue is seen on all kinds of agent pods.
How to duplicate this issue:
- Set up K8s and create a deployment to run Jenkins. This will run a pod that provisions Jenkins. Set up Jenkins to use Kubernetes agents.
- Create a pipeline job that runs on Kubernetes agents, trigger it, and let it start the agent pod and run the stages and steps.
- While the pipeline step is executing, delete the pod that runs the server.
- K8s will start a new pod to provision Jenkins. Delete the newly started pod as well.
- Wait for K8s to finish provisioning the pod again, then check the pipeline job's console: the job hangs forever. It is expected to resume and complete.
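The reproduction steps above can be sketched as a small Python script that drives kubectl. This is only an illustration under assumptions: the namespace `jenkins` and the label selector `app=jenkins` are hypothetical values that do not come from this report, and the helper names are made up for the sketch. The `run` parameter defaults to invoking the real `kubectl` binary and can be replaced with a stub.

```python
import subprocess
import time

# Hypothetical values, not taken from the report: adjust to your cluster.
NAMESPACE = "jenkins"
SELECTOR = "app=jenkins"


def kubectl(*args):
    """Run kubectl with the given arguments and return its stdout."""
    result = subprocess.run(
        ["kubectl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def server_pods(run=kubectl):
    """Return the names of the pods matching the Jenkins server selector."""
    out = run(
        "get", "pods", "-n", NAMESPACE, "-l", SELECTOR,
        "-o", "jsonpath={.items[*].metadata.name}",
    )
    return out.split()


def delete_pods(names, run=kubectl):
    """Request deletion of the named pods without waiting for termination."""
    for name in names:
        run("delete", "pod", name, "-n", NAMESPACE, "--wait=false")


def wait_for_replacement(old_names, run=kubectl, poll_seconds=2.0, attempts=150):
    """Poll until a pod appears whose name is not in old_names."""
    for _ in range(attempts):
        fresh = [n for n in server_pods(run) if n not in old_names]
        if fresh:
            return fresh
        time.sleep(poll_seconds)
    raise TimeoutError("no replacement Jenkins pod appeared")


def reproduce(run=kubectl, poll_seconds=2.0):
    """Delete the server pod, then delete its replacement as soon as it exists."""
    first = server_pods(run)
    delete_pods(first, run)                      # first restart of the server
    fresh = wait_for_replacement(first, run, poll_seconds)
    delete_pods(fresh, run)                      # second restart, before Jenkins is up
    return first, fresh
```

Calling `reproduce()` while the pipeline step is still executing should approximate the double restart described above; whether the second deletion lands before Jenkins has finished starting depends on timing in your cluster.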
In my case, the Jenkins pod is restarted multiple times by K8s due to OOM errors or other reasons.
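To check whether the server container was in fact restarted because of OOM, one option is to inspect the pod status. The sketch below is an assumption about how one might do that, not something from this report: it parses the JSON printed by `kubectl get pod <name> -o json` and reads the standard `restartCount` and `lastState.terminated.reason` fields; the function name `restart_summary` is hypothetical.

```python
import json


def restart_summary(pod_json):
    """Summarize restarts per container from `kubectl get pod <name> -o json` output."""
    pod = json.loads(pod_json)
    summary = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is absent if the container has never been restarted.
        last = status.get("lastState", {}).get("terminated") or {}
        summary.append({
            "container": status["name"],
            "restarts": status.get("restartCount", 0),
            "last_reason": last.get("reason"),  # "OOMKilled" after an OOM kill
        })
    return summary
```

A `last_reason` of `OOMKilled` together with a restart count of two or more would match the situation described above. Note that a pod deleted and recreated by the deployment starts again with a restart count of zero, so this only helps for in-place container restarts.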
The test pipeline job has the following definition:

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - name: ubuntu
    image: 'debian:buster-slim'
    imagePullPolicy: Always
    command: ['cat']
    tty: true
    resources:
      requests:
        cpu: 200m
        memory: 500Mi
      limits:
        cpu: 500m
        memory: 1Gi

pipeline {
  agent {
    kubernetes {
      label "test-container"
      yamlFile 'cloudprovider.yaml'
      defaultContainer 'ubuntu'
    }
  }
  stages {
    stage ('Build') {
      steps {
        sh '''
          echo "Start"
          sleep 100
          ls -lah
          echo "End"
        '''
      }
    }
  }
}