• Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • kubernetes-plugin
    • None

      I am having trouble launching a second k8s based agent into a new cluster.

      My use case is that I have a "shared" cluster (where Jenkins lives) where agents usually run and I have additional clusters for dev/tst/etc. Agents in shared (using the Kubernetes plugin) cluster run CI and then deploy built app containers to the dev cluster (for example) using kubectl. Once an app container is running I want to be able to launch a new agent into the [dev] cluster using the plugin to run some tests.

      It looks like I should just be able to set up the second cloud in Jenkins (we use config as code) and then reference it at the top of the pod definition in the pipeline when I want to launch the new agent.

      However, when I do, the agent tries to launch in the existing cluster ("shared" in this example) rather than in the cloud I specify.

      In the logs I see:

      ```
      Oct 17, 2019 10:27:21 AM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      Template for label jknode-devops-repotemplate-dotnetcore-test-15: Kubernetes Pod Template
      Oct 17, 2019 10:27:21 AM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud
      Building connection to Kubernetes primary-aks-eurw-dev URL https://primary-aks-eurw-dev-wr-aasda324.hcp.westeurope.azmk8s.io:443/ namespace null
      Oct 17, 2019 10:27:21 AM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud
      Connected to Kubernetes primary-aks-eurw-dev URL https://primary-aks-eurw-shared-wr-dasdad242.hcp.westeurope.azmk8s.io:443/

      ```

      From a quick look at the source code, the plugin appears to be pulling the wrong client out of the cache. The logs show it starting to build the connection to the dev cluster, then retrieving the client for the shared cluster and using that instead.
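To make the suspected failure mode concrete, here is a minimal sketch of how a client cache could hand back a client for the wrong cluster. It is not the plugin's actual code: `FakeClient`, `NaiveCache`, `UrlAwareCache`, and the example URLs are all hypothetical names invented for illustration, and the real cause in the plugin may be different.

```java
import java.util.HashMap;
import java.util.Map;

public class ClientCacheSketch {
    /** Hypothetical stand-in for a Kubernetes client; it only records its server URL. */
    static final class FakeClient {
        final String serverUrl;

        FakeClient(String serverUrl) {
            this.serverUrl = serverUrl;
        }
    }

    /** Cache keyed only by a label: a hit is returned without checking the requested URL. */
    static final class NaiveCache {
        private final Map<String, FakeClient> clients = new HashMap<>();

        FakeClient get(String key, String serverUrl) {
            return clients.computeIfAbsent(key, k -> new FakeClient(serverUrl));
        }
    }

    /** Cache whose key includes the server URL, so a different URL always gets its own client. */
    static final class UrlAwareCache {
        private final Map<String, FakeClient> clients = new HashMap<>();

        FakeClient get(String key, String serverUrl) {
            return clients.computeIfAbsent(key + "|" + serverUrl, k -> new FakeClient(serverUrl));
        }
    }

    public static void main(String[] args) {
        // Hypothetical URLs standing in for the shared and dev clusters.
        String shared = "https://shared.example.invalid";
        String dev = "https://dev.example.invalid";

        NaiveCache naive = new NaiveCache();
        naive.get("cloud", shared); // first request populates the cache
        System.out.println("naive cache, dev requested: " + naive.get("cloud", dev).serverUrl);

        UrlAwareCache aware = new UrlAwareCache();
        aware.get("cloud", shared);
        System.out.println("url-aware cache, dev requested: " + aware.get("cloud", dev).serverUrl);
    }
}
```

In the naive variant the second lookup reuses the client created for the first URL, which would produce a "Building connection to ... dev" line followed by a "Connected to ... shared" line like the ones in the log above.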

          [JENKINS-59826] Error launching agent into second cloud

          csanchez I'm trying to diagnose this issue and propose a fix, but I'm blocked by:

          https://groups.google.com/d/msg/jenkinsci-users/vkMWirNTuTQ/j2luFwpNCgAJ

          Any idea how to overcome this so I can fix this issue?

          Andrew Cameron-Douglas added a comment

          I answered the email. Can you share your Jenkinsfile so I can see how you are configuring the pod?

          Carlos Sanchez added a comment

          Andrew Cameron-Douglas added a comment - edited

          Thanks csanchez! I'm just looking into the reasons for the failures a little deeper.

          The Jenkinsfile is actually part of a shared library. Two of its files are deploy.groovy and TestRunner.groovy. deploy.groovy spins up an agent for use during deployment, and TestRunner.groovy then spins up an agent for testing in the target cluster at the end of the deployment. Please find these attached (abridged).

          I'll be working on this most of the day, so feel free to respond and I'll get back to you ASAP.

          TestRunner.groovy


          You are nesting two podTemplates; you can't do that to launch a pod. A nested podTemplate creates a pod template in Jenkins, it does not launch a new pod.
          You should create a new job for the tests and call it, instead of using podTemplate again.

          Carlos Sanchez added a comment

          I have pulled the second pod out into a separate job, but I still get the same error.

          This occurs irrespective of whether I trigger the job from another job or manually.


          Oct 23, 2019 10:06:11 AM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
          Template for label jknode-devops-repotemplate-dotnetcore-test-3: Kubernetes Pod Template
          
          Oct 23, 2019 10:06:11 AM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud
          Building connection to Kubernetes primary-aks-eurw-dev URL  https://primary-aks-eurw-dev-wr-aasda324.hcp.westeurope.azmk8s.io namespace null
          
          Oct 23, 2019 10:06:11 AM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud
          Connected to Kubernetes primary-aks-eurw-dev URL https://primary-aks-eurw-shared-wr-dasdad242.hcp.westeurope.azmk8s.io:443/
          
          Oct 23, 2019 10:06:11 AM WARNING org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
          Failed to count the # of live instances on Kubernetes
          io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://primary-aks-eurw-shared-wr-dasdad242.hcp.westeurope.azmk8s.io/api/v1/namespaces/devops-repotemplate-dotnetcore/pods?labelSelector=jenkins%3Dslave. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods is forbidden: User "system:serviceaccount:jenkins-dev:jenkins" cannot list resource "pods" in API group "" in the namespace "devops-repotemplate-dotnetcore".
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510)
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:447)
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413)
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:354)
              at io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:147)
              at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:614)
              at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:63)
              at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.getActiveSlavePods(KubernetesCloud.java:581)
              at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.addProvisionedSlave(KubernetesCloud.java:556)
              at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.provision(KubernetesCloud.java:508)
              at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:729)
              at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:332)
              at hudson.slaves.NodeProvisioner.access$900(NodeProvisioner.java:63)
              at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:823)
              at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
              at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
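The mismatch between the "Building connection to" and "Connected to" lines can be checked mechanically. The following is a rough sketch, assuming the FINE log lines keep the shape shown above; `ConnectionLogCheck`, `hostOf`, and `sameCluster` are hypothetical helper names and are not part of the plugin.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConnectionLogCheck {
    // Matches "<prefix> Kubernetes <cloud name> URL <url>" as seen in the log excerpt.
    private static final Pattern LINE =
            Pattern.compile("(Building connection to|Connected to) Kubernetes (\\S+) URL\\s+(\\S+)");

    /** Returns the host of the URL in a matching log line, or null if the line does not match. */
    static String hostOf(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return URI.create(m.group(3)).getHost();
    }

    /** True only when both lines match the pattern and name the same API server host. */
    static boolean sameCluster(String buildingLine, String connectedLine) {
        String requested = hostOf(buildingLine);
        String actual = hostOf(connectedLine);
        return requested != null && requested.equals(actual);
    }

    public static void main(String[] args) {
        // The two FINE lines copied from the log above.
        String building = "Building connection to Kubernetes primary-aks-eurw-dev URL  "
                + "https://primary-aks-eurw-dev-wr-aasda324.hcp.westeurope.azmk8s.io namespace null";
        String connected = "Connected to Kubernetes primary-aks-eurw-dev URL "
                + "https://primary-aks-eurw-shared-wr-dasdad242.hcp.westeurope.azmk8s.io:443/";

        System.out.println("requested host: " + hostOf(building));
        System.out.println("connected host: " + hostOf(connected));
        System.out.println("same cluster:   " + sameCluster(building, connected));
    }
}
```

Comparing hosts rather than full URLs ignores differences such as an explicit `:443` port or a trailing slash, which vary between the two log lines without indicating a different cluster.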

          Andrew Cameron-Douglas added a comment

          When I click 'test connection' in management I see a successful test and this:

          Oct 23, 2019 10:48:40 AM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesFactoryAdapter
          Configuring Kubernetes client from kubeconfig file
          
          Oct 23, 2019 10:48:40 AM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesFactoryAdapter
          Creating Kubernetes client: KubernetesFactoryAdapter [serviceAddress=https://primary-aks-eurw-dev-wr-aasda324.hcp.westeurope.azmk8s.io, namespace=, caCertData=null, credentials=org.jenkinsci.plugins.plaincredentials.impl.FileCredentialsImpl@c615694a, skipTlsVerify=false, connectTimeout=0, readTimeout=0]

          Andrew Cameron-Douglas added a comment

            Assignee: Unassigned
            Reporter: Andrew Cameron-Douglas (adouglas_wr)