Bug
Resolution: Incomplete
Minor
None
We use a large custom jnlp container that we build ourselves (around 4 GB) in our pod template. I have noticed that, much of the time, when a build triggers a node scale-up, the container fails in the same way as in the log below, with Exit Code: 143, Reason: Error.
I am not sure how else to troubleshoot this, as I don't see anything relevant in the Kubernetes API logs or in Jenkins when it happens. I presume it is caused by the large jnlp container? Is there a way to prevent this failure? Any help would be appreciated.
Thank you
Log
Created Pod: kubernetes jenkins/development-cloud-40-d-3dkk3
[Warning][jenkins/development-cloud-40-d-3dkk3][FailedScheduling] 0/7 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) were unschedulable, 5 Insufficient cpu.
Still waiting to schedule task
‘development-cloud-40-d-3dkk3’ is offline
[Normal][jenkins/development-cloud-40-d-3dkk3][TriggeredScaleUp] pod triggered scale-up: [{eks-managed-spot-20220520172828856700000007-c2c07141 5->6 (max: 12)}]
[Warning][jenkins/development-cloud-40-d-3dkk3][FailedScheduling] 0/7 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) were unschedulable, 5 Insufficient cpu.
[Warning][jenkins/development-cloud-40-d-3dkk3][FailedScheduling] 0/7 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate, 5 Insufficient cpu.
[Warning][jenkins/development-cloud-40-d-3dkk3][FailedScheduling] 0/7 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate, 5 Insufficient cpu.
[Normal][jenkins/development-cloud-40-d-3dkk3][Scheduled] Successfully assigned jenkins/development-cloud-40-d-3dkk3 to ip-x-x-x-x.ec2.internal
[Normal][jenkins/development-cloud-40-d-3dkk3][Pulling] Pulling image "xyz:latest"
[Normal][jenkins/development-cloud-40-d-3dkk3][TaintManagerEviction] Cancelling deletion of Pod jenkins/development-cloud-40-d-3dkk3
[Normal][jenkins/development-cloud-40-d-3dkk3][Pulled] Successfully pulled image "xyz:latest" in 47.139135874s
[Normal][jenkins/development-cloud-40-d-3dkk3][Created] Created container jnlp
[Normal][jenkins/development-cloud-40-d-3dkk3][Started] Started container jnlp
[Normal][jenkins/development-cloud-40-d-3dkk3][Pulling] Pulling image "abc:latest"
jenkins/development-cloud-40-d-3dkk3 Container jnlp was terminated (Exit Code: 143, Reason: Error)
[Pipeline] // node
[Pipeline] }
[Pipeline] // podTemplate
[Pipeline] }
[Pipeline] // timeout
[Pipeline] End of Pipeline
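
A note on reading the exit code in the log above: by the usual Unix convention, an exit code above 128 means the process ended because of a signal, and the signal number is the code minus 128. For 143 that gives 15, which is SIGTERM. This suggests the jnlp container was asked to stop rather than crashing on its own; the log alone does not show what sent the signal. The small Python sketch below only illustrates that arithmetic. The helper name `describe_exit_code` is hypothetical and is not part of Jenkins or Kubernetes.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a container exit code using the 128 + signal convention.

    Hypothetical helper for illustration only.
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Unknown signal number on this platform: fall back to the number.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(143))  # prints "terminated by SIGTERM"
```

Read this way, the failure looks like an external termination of the agent container, so the events around the time the new node was still tainted `node.kubernetes.io/not-ready` (including the `TaintManagerEviction` line) are a reasonable place to look next.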