[JENKINS-64628] Stop using random PodTemplate.id (was: Kubernetes Plugin IllegalStateException: Not expecting pod template to be null at this point)

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • kubernetes-plugin
    • Jenkins: 2.263.1
      jenkins/inbound-agent:4.6-1
      kubernetes-plugin: 1.28.5
      durable-task-plugin: 1.35

      Jobs do not resume after a master restart. Issue present in Jenkins 2.263.1 with kubernetes-plugin 1.28.5. 

       
      job console log:

      Running on jenkins-agent-test-experimental-qtmr6 in /home/jenkins/agent/workspace/test
      [Pipeline] {
      [Pipeline] stage
      [Pipeline] { (long-running-job)
      [Pipeline] sh
      Resuming build at Thu Jan 14 18:44:48 UTC 2021 after Jenkins restart
      Waiting to resume part of test #14: ‘jenkins-agent-test-experimental-qtmr6’ is offline
      Waiting to resume part of test #14: ‘jenkins-agent-test-experimental-qtmr6’ is offline
      Waiting to resume part of test #14: ‘jenkins-agent-test-experimental-qtmr6’ is offline
      Waiting to resume part of test #14: ‘jenkins-agent-test-experimental-qtmr6’ is offline
      Waiting to resume part of test #14: ‘jenkins-agent-test-experimental-qtmr6’ is offline
      Waiting to resume part of test #14: ‘jenkins-agent-test-experimental-qtmr6’ is offline
      Waiting to resume part of test #14: ‘jenkins-agent-test-experimental-qtmr6’ is offline
      Waiting to resume part of test #14: ‘jenkins-agent-test-experimental-qtmr6’ is offline
      Ready to run at Thu Jan 14 18:46:49 UTC 2021
      

       
      logs from agent:

      • remoting.log.1

       

      Jan 14, 2021 6:39:59 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://jenkins-svc-default-experimental:8080/]
      Jan 14, 2021 6:39:59 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
      Jan 14, 2021 6:39:59 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFO: Remoting TCP connection tunneling is enabled. Skipping the TCP Agent Listener Port availability check
      Jan 14, 2021 6:39:59 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Agent discovery successful
        Agent address: jenkins-svc-default-experimental
        Agent port:    50000
        Identity:      af:93:9a:2b:e2:a2:16:cc:f4:07:df:8b:e8:66:24:80
      Jan 14, 2021 6:39:59 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Jan 14, 2021 6:39:59 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to jenkins-svc-default-experimental:50000
      Jan 14, 2021 6:40:00 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Trying protocol: JNLP4-connect
      Jan 14, 2021 6:40:02 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Remote identity confirmed: af:93:9a:2b:e2:a2:16:cc:f4:07:df:8b:e8:66:24:80
      Jan 14, 2021 6:40:10 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Jan 14, 2021 6:41:40 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      Jan 14, 2021 6:41:51 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.ConnectException Connection refused (Connection refused)
      Jan 14, 2021 6:42:02 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.ConnectException Connection refused (Connection refused)
      Jan 14, 2021 6:42:14 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.ConnectException Connection refused (Connection refused)
      Jan 14, 2021 6:42:25 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.ConnectException Connection refused (Connection refused)
      Jan 14, 2021 6:42:36 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.ConnectException Connection refused (Connection refused)
      Jan 14, 2021 6:42:46 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.ConnectException Connection refused (Connection refused)
      Jan 14, 2021 6:42:58 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:43:08 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:43:18 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:43:28 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:43:38 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:43:48 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:43:59 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:44:09 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:44:19 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:44:29 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:44:39 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:44:50 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:45:00 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins-svc-default-experimental:8080/tcpSlaveAgentListener/. Will try again: response code=503
      Jan 14, 2021 6:45:11 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Performing onReconnect operation.
      Jan 14, 2021 6:45:11 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1 onReconnect
      INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@39ad2ce
      

       

      • remoting.log.0 

       

      Jan 14, 2021 6:45:30 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://jenkins-svc-default-experimental:8080/]
      Jan 14, 2021 6:45:31 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
      Jan 14, 2021 6:45:31 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFO: Remoting TCP connection tunneling is enabled. Skipping the TCP Agent Listener Port availability check
      Jan 14, 2021 6:45:31 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Agent discovery successful
        Agent address: jenkins-svc-default-experimental
        Agent port:    50000
        Identity:      af:93:9a:2b:e2:a2:16:cc:f4:07:df:8b:e8:66:24:80
      Jan 14, 2021 6:45:31 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Jan 14, 2021 6:45:31 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to jenkins-svc-default-experimental:50000
      Jan 14, 2021 6:45:31 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Trying protocol: JNLP4-connect
      Jan 14, 2021 6:45:34 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Remote identity confirmed: af:93:9a:2b:e2:a2:16:cc:f4:07:df:8b:e8:66:24:80
      Jan 14, 2021 6:45:43 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Jan 14, 2021 6:46:56 PM org.jenkinsci.plugins.durabletask.FileMonitoringTask$Watcher run
      WARNING: giving up on watching /home/jenkins/agent/workspace/test@tmp/durable-2321c62b
      java.lang.IllegalStateException: Not expecting pod template to be null at this point
      	at org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave.getTemplate(KubernetesSlave.java:92)
      	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.secretsOf(SecretsMasker.java:144)
      	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:122)
      	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:94)
      	at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:94)
      	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:139)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.getContextVariable(CpsThread.java:135)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.doGet(CpsStepContext.java:297)
      	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:75)
      	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.getListener(DefaultStepContext.java:127)
      	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:87)
      	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.exited(DurableTaskStep.java:623)
      	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:566)
      	at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:936)
      	at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:909)
      	at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:860)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:211)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
      	at hudson.remoting.Request$2.run(Request.java:375)
      	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:73)
      	at org.jenkinsci.remoting.CallableDecorator.call(CallableDecorator.java:18)
      	at hudson.remoting.CallableDecoratorList$1.call(CallableDecoratorList.java:22)
      	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.lang.Thread.run(Thread.java:834)
      	Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection to jenkins-svc-default-experimental/172.20.249.199:50000
      		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1800)
      		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
      		at hudson.remoting.Channel.call(Channel.java:1001)
      		at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:286)
      		at org.jenkinsci.plugins.workflow.steps.durable_task.$Proxy9.exited(Unknown Source)
      		at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$HandlerImpl.exited(DurableTaskStep.java:743)
      		at org.jenkinsci.plugins.durabletask.FileMonitoringTask$Watcher.run(FileMonitoringTask.java:531)
      		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      		at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
      		at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      		at java.lang.Thread.run(Thread.java:748)
      ...
      

       

      logs from Log Recorder for `org.csanchez.jenkins.plugins.kubernetes` on Jenkins master:

      Jan 14, 2021 6:39:52 PM FINEST org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher
      [MODIFIED] jenkins-agent-test-experimental-qtmr6
      Jan 14, 2021 6:39:52 PM FINE org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher
      All containers are running for pod jenkins-agent-test-experimental-qtmr6
      Jan 14, 2021 6:39:52 PM FINE org.csanchez.jenkins.plugins.kubernetes.TaskListenerEventWatcher
      TaskListenerEventWatcher onClose: jenkins-agent-test-experimental-qtmr6
      Jan 14, 2021 6:39:52 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
      Pod is running: k8s-experimental default/jenkins-agent-test-experimental-qtmr6
      


          Sithik Settu Mohamed added a comment -

          I observe the same issue as well, and it now happens every day, even without a Jenkins restart. This is something of a blocker: as a workaround we have to remove the agent and create a new one, which defeats the purpose of k8s agents.

          The issue still exists with version 1.28.5.

          Jenkins version: 2.266

          kubernetes plugin version: 1.28.5

          java.lang.IllegalStateException: Not expecting pod template to be null at this point
          at org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave.getTemplate(KubernetesSlave.java:92)
          at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.secretsOf(SecretsMasker.java:144)
          at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:122)
          at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:94)
          at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:94)
          at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:139)
          at org.jenkinsci.plugins.workflow.cps.CpsThread.getContextVariable(CpsThread.java:135)
          at org.jenkinsci.plugins.workflow.cps.CpsStepContext.doGet(CpsStepContext.java:297)
          at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:75)
          at org.jenkinsci.plugins.workflow.support.DefaultStepContext.getListener(DefaultStepContext.java:127)
          at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:79)
          at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:258)
          at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:193)
          at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
          at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

          Lars Berntzon added a comment - - edited

          Verified that the problem is still there on kubernetes-plugin 1.28.7, running the jenkins/jenkins:2.277-jdk11 master image and the jenkins/inbound-agent:4.6-1-jdk11 agent image on a helm jenkins/jenkins release 2.15.1.

          Note: this happens if agents are not stopped directly after a job but are retained for a while. Restarting Jenkins then makes those agents unusable.
           


          Chris Nelson added a comment - - edited

          We have been seeing this issue as well on the most recent LTS and plugin releases, and believe it is due to our use of the configuration-as-code plugin to configure this plugin.

          The sequence is:

            1. Jenkins starts for the first time; the casc plugin runs and configures the kubernetes plugin and our pod template.
            2. Builds work fine.
            3. Jenkins restarts; the casc plugin runs again and configures the kubernetes plugin and our pod template, changing the pod template id when it recreates it.
            4. The existing agents now have a pod template id that doesn't match, and this issue occurs.
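
          The sequence above can be illustrated with a small Java sketch. This is a simplified model of the behaviour described in this comment, not the kubernetes-plugin code: the class `PodTemplateIdModel` and its methods `configure` and `resolve` are hypothetical names, and the id `default-template` is made up. It assumes that each configuration load replaces the known templates and generates a random id unless one is supplied.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Simplified model of the failure described above; NOT the kubernetes-plugin code.
final class PodTemplateIdModel {
    // Templates currently known to the controller, keyed by template id.
    private final Map<String, String> templatesById = new HashMap<>();

    // Models one configuration load (first start or restart).
    // When no id is supplied, a fresh random id is generated.
    String configure(String name, String fixedIdOrNull) {
        templatesById.clear(); // a reload replaces the previous templates
        String id = fixedIdOrNull != null ? fixedIdOrNull : UUID.randomUUID().toString();
        templatesById.put(id, name);
        return id;
    }

    // Models an agent resolving its template by the id it stored at creation time.
    String resolve(String templateId) {
        String name = templatesById.get(templateId);
        if (name == null) {
            throw new IllegalStateException("Not expecting pod template to be null at this point");
        }
        return name;
    }

    public static void main(String[] args) {
        PodTemplateIdModel controller = new PodTemplateIdModel();

        // Random id: the agent keeps the id from before the restart.
        String agentTemplateId = controller.configure("default", null);
        controller.configure("default", null); // restart, configuration applied again
        try {
            controller.resolve(agentTemplateId);
        } catch (IllegalStateException e) {
            System.out.println("random id after reload: " + e.getMessage());
        }

        // Fixed id: the same id is supplied on every load, so the lookup still works.
        agentTemplateId = controller.configure("default", "default-template");
        controller.configure("default", "default-template");
        System.out.println("fixed id after reload: " + controller.resolve(agentTemplateId));
    }
}
```

          In this model the retained agent fails only when the id is regenerated on reload; supplying the same id on every load keeps the lookup working.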

           


          Vincent Latombe added a comment - - edited

          Thanks for the intel cncult. I can check whether the plugin can handle this case better (unsure because casc uses the same endpoints as the UI), but an easy workaround should be to specify a unique id for each pod template in the casc configuration file.


          Julio Morimoto added a comment -

          Just ran a few tests here. I can confirm that keeping a fixed pod template ID between Jenkins controller restarts is an effective workaround. The ID needs to be included in the JCasC cloud configuration.

          For folks running Jenkins inside Kubernetes, the official helm chart does not yet support this. I am preparing a PR. If it is approved, one extra line in your values.yaml should be enough to work around the issue until this is resolved.

          Julio Morimoto added a comment -

          I'm hoping this PR will make it into the helm chart. It should provide a smoother workaround than restarting agent pods on every controller restart.

          https://github.com/jenkinsci/helm-charts/pull/271

          Julio Morimoto added a comment - - edited

          The PR was approved. In its next release the helm chart should provide a smoother experience.

          The implementation was quite simple. The helm chart will create a unique pod template ID hashed from the YAML contents of the respective keys in the values.yaml file. Consistent IDs are created for the default agent and for each entry defined under additionalAgents.

          More importantly, this is still a workaround. If you change agent definitions in the values.yaml file and redeploy, JCasC will receive new pod template IDs, and a controller restart will result in the same error. But, until this is properly handled in the kubernetes-plugin, redeploying other unrelated JCasC configs and restarting the controller with the same pod templates should no longer be a problem.
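
          The idea of a content-derived ID, including the caveat above, can be sketched in Java. This is only an illustration under stated assumptions, not the helm chart's template code: the class `ContentHashedTemplateId`, the method `idFor`, the choice of SHA-256, and the sample definitions are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustration only: derive a pod template id from the text of its definition.
final class ContentHashedTemplateId {
    // Returns the SHA-256 digest of the definition as a lowercase hex string.
    static String idFor(String templateDefinition) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(templateDefinition.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical agent definitions, standing in for values.yaml content.
        String original = "image: jenkins/inbound-agent\ntag: 4.6-1\n";
        String changed = "image: jenkins/inbound-agent\ntag: 4.6-1-jdk11\n";

        // Same definition, same id: a controller restart keeps the id stable.
        System.out.println(idFor(original).equals(idFor(original)));
        // Changed definition, different id: retained agents no longer match.
        System.out.println(idFor(original).equals(idFor(changed)));
    }
}
```

          The first comparison corresponds to redeploying unrelated configuration, and the second to editing the agent definition, which is the case the workaround does not cover.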


          Julio Morimoto added a comment -

          Just tested version 3.2.0 of the official helm chart, and it works around the issue as expected. There is actually no need to change your deployed values.yaml file. Internally, the chart calculates more consistent pod template IDs.

          Controller restarts are no longer an issue for us unless the pod template in the values.yaml file changes.

          Jesse Glick added a comment -

          mdalton_ti’s log messages suggest use of USE_WATCHING mode (JENKINS-52165), perhaps implicitly via pipeline-cloudwatch-logs plugin?


          Jesse Glick added a comment -

          Some users report problems with configuration-as-code. (Not clear if that is the case for all reporters.) I would say this is because https://github.com/jenkinsci/kubernetes-plugin/blob/87e1dee008a5bc708b4b865fdde03672b648ef09/src/main/java/org/csanchez/jenkins/plugins/kubernetes/PodTemplate.java#L880-L883 is reasonable behavior but https://github.com/jenkinsci/kubernetes-plugin/blob/87e1dee008a5bc708b4b865fdde03672b648ef09/src/main/java/org/csanchez/jenkins/plugins/kubernetes/PodTemplate.java#L217 is not. Recommend deleting https://github.com/jenkinsci/kubernetes-plugin/blob/87e1dee008a5bc708b4b865fdde03672b648ef09/src/main/resources/org/csanchez/jenkins/plugins/kubernetes/PodTemplate/config.jelly#L8-L10 and the id field, and reimplementing getId to be a hash of data fields. Generally speaking, any hidden random id fields in Jenkins configuration structures are conceptually incompatible with configuration-as-code. (Same is true for IdCredentials etc.)
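
          A rough Java sketch of what an id derived from data fields could look like. It is not a patch for PodTemplate: the class `DerivedIdTemplate` and its three fields are hypothetical, and using a name-based UUID from `UUID.nameUUIDFromBytes` is just one possible way to hash the fields.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical stand-in for a template type with no stored id field.
final class DerivedIdTemplate {
    private final String name;
    private final String label;
    private final String image;

    DerivedIdTemplate(String name, String label, String image) {
        this.name = name;
        this.label = label;
        this.image = image;
    }

    // The id is computed from the data fields on every call, so two instances
    // built from the same configuration always report the same id.
    String getId() {
        // Newline-separated; a real implementation would need an unambiguous encoding.
        String data = name + '\n' + label + '\n' + image;
        return UUID.nameUUIDFromBytes(data.getBytes(StandardCharsets.UTF_8)).toString();
    }

    public static void main(String[] args) {
        // Two independent loads of the same configuration, as after a restart.
        DerivedIdTemplate beforeRestart = new DerivedIdTemplate("default", "agent", "jenkins/inbound-agent:4.6-1");
        DerivedIdTemplate afterRestart = new DerivedIdTemplate("default", "agent", "jenkins/inbound-agent:4.6-1");
        System.out.println(beforeRestart.getId().equals(afterRestart.getId())); // prints "true"
    }
}
```

          Because nothing random is stored, reapplying the same configuration cannot produce a different id.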


            vlatombe Vincent Latombe
            mdalton_ti Michael