[JENKINS-59652] [kubernetes plugin] Protect Jenkins agent pods from eviction

Type: Improvement
Resolution: Unresolved
Priority: Minor
Labels: None
Environment:
- GKE cluster master and node pools version: 1.14
- Cluster autoscaler enabled
- Jenkins master LTS installed with the official Helm chart (1.1.24)
- Kubernetes plugin: 1.19.0
I have a sporadic bug that has been occurring on my Jenkins installation for months now:
{noformat}
java.net.ProtocolException: Expected HTTP 101 response but was '500 Internal Server Error'
    at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
io.fabric8.kubernetes.client.KubernetesClientException: error dialing backend: EOF
{noformat}
I believe this was already reported in the threads below, and I understand that it is caused by an HTTP 500 returned by the Kubernetes API:
- https://issues.jenkins-ci.org/browse/JENKINS-39844
- https://stackoverflow.com/questions/50949718/kubernetes-gke-error-dialing-backend-eof-on-random-exec-command
However, after further investigation, I am now sure that the bug occurs only when the cluster autoscaler is on, and more precisely when the autoscaler scales a node down while a Jenkins build is running. It may be an edge case.
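For what it's worth, the way I correlate the failed builds with scale-downs is by checking cluster events around the failure time. This is only a heuristic (event wording varies across autoscaler versions):
{noformat}
# list recent events across all namespaces, oldest first, and keep
# anything that looks like a scale-down, drain, or eviction
kubectl get events --all-namespaces --sort-by=.lastTimestamp | grep -iE 'scaledown|drain|evict'
{noformat}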
To fix this, I set the following annotation on all my pods in the podTemplate YAML:
{noformat}
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
{noformat}
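For context, a minimal sketch of what the pod spec in the podTemplate YAML looks like with the annotation in place (the label, image, and resource values are illustrative, not my exact setup):
{noformat}
apiVersion: v1
kind: Pod
metadata:
  labels:
    jenkins: slave                   # illustrative label used to select the agent pods
  annotations:
    # ask the cluster autoscaler not to evict this pod during scale-down
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
  - name: jnlp
    image: jenkins/jnlp-slave        # illustrative agent image
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
{noformat}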
However, it didn't protect them. So I am now trying to set up a PodDisruptionBudget for each of my slave pods to protect them from eviction, along the lines of the sketch below.
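A sketch of the PDB I am trying, assuming the agent pods carry a matching label (the name and label are illustrative):
{noformat}
apiVersion: policy/v1beta1           # PDB API group available on Kubernetes 1.14
kind: PodDisruptionBudget
metadata:
  name: jenkins-agents-pdb
spec:
  # forbid all voluntary disruptions (including autoscaler node drains)
  # of the pods matched by the selector
  maxUnavailable: 0
  selector:
    matchLabels:
      jenkins: slave                 # must match the labels on the agent pods
{noformat}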
But when passing the PDB in the podTemplate YAML, it is just totally ignored. How can I protect my Jenkins slave pods from eviction?
Issue links:
- Is duplicated by: JENKINS-67167 "in a kubernetes pod sh steps inside container() are failing sporadically" (Open)
- Relates to: JENKINS-64848 "Shell step failing randomly" (Open)
- Relates to: JENKINS-67474 "Pipeline is failing due to io.fabric8.kubernetes.client.KubernetesClientException: not ready after n milliseconds" (Closed)