
Message: Unauthorized from OpenShift API after some time

      After an indeterminate amount of time, the Jenkins kubernetes-plugin can no longer connect to the OpenShift server, and this error appears in the Jenkins log:

      Nov 15, 2016 11:44:13 AM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      Excess workload after pending Spot instances: 1
      Nov 15, 2016 11:44:13 AM WARNING org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      Failed to count the # of live instances on Kubernetes
      io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://openshift.server/api/v1/namespaces/jenkins/pods?labelSelector=jenkins%3Dslave. Message: Unauthorized
      .
      	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:310)
      	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:263)
      	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:232)
      	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:416)
      	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:58)
      	at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.addProvisionedSlave(KubernetesCloud.java:588)
      	at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.provision(KubernetesCloud.java:463)
      	at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:701)
      	at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:307)
      	at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:60)
      	at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:798)
      	at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:50)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      

      Clicking "Test Connection" in the plugin interface reports "Connection successful" but generates the following in the Jenkins log:

      Nov 15, 2016 1:01:16 PM WARNING io.fabric8.kubernetes.client.Config tryServiceAccount
      Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
      Nov 15, 2016 1:01:16 PM WARNING org.apache.http.client.protocol.ResponseProcessCookies processCookies
      Invalid cookie header: "Set-Cookie: ssn=MTQ3OTIxMTI3NnxKWjlkclhMOTJhNlU2MFZXdWVhdkdkQTBiVlloT3lMLWZDQkpERE4wdThsVGNkd2h6LVVmTzJvWWw0YUUyWlBqNEJHRlg3cHdIUzJVM2Q3SC1VTFZDY1BMNzVMUXJwOUk5QlJ6blBlQkhnMEZRMFEyZW5pdkxjcEdBbzdwd0dBQ2ZnPT1830DMNv2oQAZm6jBMgQ7RiFXwKY6UAJ7OBXixgk4kPoQ=; Path=/; Expires=Tue, 15 Nov 2016 13:01:16 GMT; Max-Age=3600; HttpOnly; Secure". Invalid 'expires' attribute: Tue, 15 Nov 2016 13:01:16 GMT
      

      Restarting the Jenkins server fixes the problem until it happens again some time later.

          [JENKINS-39867] Message: Unauthorized from OpenShift API after some time

          Albert V added a comment -

          Any idea how we can fix this issue? csanchez or iocanel?

          Ioannis Canellos added a comment -

          m4x1m0v3r Can you please tell me what you mean by "suddenly"? What I understand is that it worked, but at some point it stopped.

          Also, can you please paste the output of:

          oc adm policy who-can list pods -n jenkins

          Albert V added a comment -

          I used the term "suddenly" because after the plugin update everything was working well, but then, for no apparent reason (or for the same reason that caused this error in plugin version 0.9), it happened again. I could not find any pattern that reproduces the error.
          Here is the output you asked for:

          $ oc adm policy who-can list pods -n jenkins
          Namespace: jenkins
          Verb:      list
          Resource:  pods
          
          Users:  myUser1
                  myUser2
                  system:admin
                  system:serviceaccount:jenkins:deployer
                  system:serviceaccount:management-infra:management-admin
                  system:serviceaccount:openshift-infra:build-controller
                  system:serviceaccount:openshift-infra:daemonset-controller
                  system:serviceaccount:openshift-infra:deployment-controller
                  system:serviceaccount:openshift-infra:deploymentconfig-controller
                  system:serviceaccount:openshift-infra:endpoint-controller
                  system:serviceaccount:openshift-infra:gc-controller
                  system:serviceaccount:openshift-infra:hpa-controller
                  system:serviceaccount:openshift-infra:job-controller
                  system:serviceaccount:openshift-infra:namespace-controller
                  system:serviceaccount:openshift-infra:pet-set-controller
                  system:serviceaccount:openshift-infra:pv-attach-detach-controller
                  system:serviceaccount:openshift-infra:pv-binder-controller
                  system:serviceaccount:openshift-infra:pv-recycler-controller
                  system:serviceaccount:openshift-infra:replicaset-controller
                  system:serviceaccount:openshift-infra:replication-controller
          
          Groups: GL_APP_OSHIFT_Admins
                  system:cluster-admins
                  system:cluster-readers
                  system:masters
                  system:nodes
          
          

          Restarting the Jenkins instance still fixes the error.

          Thank you for helping me with this.

          Ioannis Canellos added a comment -

          So I am assuming that you are connecting as either myUser1, myUser2 or admin, right?

          Could it be that each time Jenkins starts, it authenticates, but after 24 hours the token expires and you start getting these errors until you restart again?
          If this is the case, then we need to make sure that the kubernetes-client handles token expiration properly.
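
One way to read "handles token expiration properly" is to rebuild the client and retry once when the API answers Unauthorized. The sketch below illustrates only that pattern. `RefreshingClient`, `UnauthorizedException`, and the factory are hypothetical names, not part of the kubernetes-client or the plugin, and a real client would have to map HTTP 401 responses onto such an exception.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical stand-in for "the API answered HTTP 401 Unauthorized". */
class UnauthorizedException extends RuntimeException {
    UnauthorizedException() {
        super("Unauthorized");
    }
}

/** Hypothetical wrapper that re-creates its client once when a call is rejected. */
class RefreshingClient<C> {
    private final Supplier<C> factory; // builds an authenticated client (assumption)
    private C client;

    RefreshingClient(Supplier<C> factory) {
        this.factory = factory;
        this.client = factory.get();
    }

    /** Runs op with the current client; on Unauthorized, rebuilds the client and retries once. */
    synchronized <R> R call(Function<C, R> op) {
        try {
            return op.apply(client);
        } catch (UnauthorizedException e) {
            client = factory.get();  // re-authenticate with a fresh client
            return op.apply(client); // a second failure propagates to the caller
        }
    }
}
```

Retrying only once keeps a genuinely revoked credential from looping forever; the second failure reaches the caller unchanged.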

          Albert V added a comment -

          Yes, I'm using one of those users: myUser1.

          It could well be a token expiration problem because, although I can't say for sure, the error has been happening about once per day.

          Albert V added a comment -

          Is there any update on this? It is still happening even in the latest version (0.11).

          Thanks!

          Lloyd Fernandes added a comment - - edited

          I have the same issue with 0.11. In my case I have tracked it down to the following.

          The org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.connect() method caches the client connection. If the credentials are not for a long-lived service account, the token expires after some time and you get an authorization failure.

          Below is the snippet as I modified it so that the client is reset after one hour.

          // Rebuild the cached client after one hour so an expired token is not reused.
          private static final long CONNECTION_RESET = 60 * 60 * 1000;

          // Time (ms since epoch) at which the cached client was last reset.
          private long savedTime = 0;

          public KubernetesClient connect()
                  throws UnrecoverableKeyException, NoSuchAlgorithmException,
                  KeyStoreException, IOException, CertificateEncodingException {

              LOGGER.log(Level.FINE, "Building connection to Kubernetes host "
                      + name + " URL " + serverUrl);
              final long curTime = System.currentTimeMillis();
              if ((client == null)
                      || (curTime > (savedTime + CONNECTION_RESET))) {
                  // Drop the cached client so the block below creates a fresh one.
                  client = null;
                  savedTime = curTime;
                  synchronized (this) {
                      if (client == null) {
                          LOGGER.log(Level.INFO,
                                  "Attempting: Building connection to Kubernetes host "
                                          + name + " URL " + serverUrl
                                          + " Namespace " + namespace);
                          client = new KubernetesFactoryAdapter(serverUrl,
                                  namespace, serverCertificate, credentialsId,
                                  skipTlsVerify, connectTimeout, readTimeout)
                                  .createClient();
                          LOGGER.log(Level.INFO,
                                  "Success: Building connection to Kubernetes host "
                                          + name + " URL " + serverUrl
                                          + " Namespace " + namespace);
                      }
                  }
              }
              return client;
          }
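
The same one-hour reset can be sketched in isolation, with the age check and the rebuild both inside the lock so that no caller can observe a half-reset client. `ExpiringHolder` and its injected clock are hypothetical names used for illustration. In the plugin, the factory would stand in for the `KubernetesFactoryAdapter(...).createClient()` call in the snippet above.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical holder that rebuilds its value once it is older than maxAgeMillis. */
class ExpiringHolder<T> {
    private final Supplier<T> factory;  // creates a fresh value, e.g. a new client
    private final long maxAgeMillis;    // e.g. 60 * 60 * 1000 for one hour
    private final LongSupplier clock;   // injected so the expiry logic can be tested
    private T value;
    private long createdAt;

    ExpiringHolder(Supplier<T> factory, long maxAgeMillis, LongSupplier clock) {
        this.factory = factory;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached value, rebuilding it first if it is missing or too old. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (value == null || now - createdAt > maxAgeMillis) {
            value = factory.get();
            createdAt = now;
        }
        return value;
    }
}
```

In production code the clock argument would typically be `System::currentTimeMillis`.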

           

           

           


          Albert V added a comment -

          I fixed it by setting up new credentials in Jenkins as a Service Account Token instead of a user.
          So this is not a defect in this plugin.

          Thank you anyway for your help.

          Marc Guillemot added a comment -

          I have the same issue when upgrading from 0.10 to 0.12.

          Steps to reproduce:

          • install Jenkins persistent in OpenShift
            in my case OpenShift Master: v1.5.1+7b451fc Kubernetes Master: v1.5.2+43a9be4
          • simple pipeline job runs successfully:
          node("maven") {
            echo "hello"
          }
          • update Kubernetes plugin from 0.10 to 0.12
            the following plugins get updated as well:
            • Pipeline: Model API
            • Script Security Plugin
            • Pipeline: Supporting APIs
            • Pipeline: Step API
            • Pipeline: API
            • Pipeline: Groovy
            • Pipeline: Job
            • Pipeline: Declarative Extension Points API
            • Pipeline: Nodes and Processes
          • simple pipeline job doesn't run anymore (it just hangs)
            warnings appear in the log:
          WARNING: Failed to count the # of live instances on Kubernetes
          io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://172.30.0.1/api/v1/pods?labelSelector=jenkins%3Dslave. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. User "system:serviceaccount:mg-test:jenkins" cannot list all pods in the cluster.

          The job runs successfully when downgrading the Kubernetes plugin to 0.10 (and this is the only plugin that gets downgraded).
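
The two log excerpts in this thread hit different endpoints. The original report fails on GET /api/v1/namespaces/jenkins/pods with "Unauthorized", while this one fails on GET /api/v1/pods with "Forbidden". The second path lists pods across all namespaces, which is consistent with the "cannot list all pods in the cluster" message. The thread does not say why 0.12 issued the cluster-wide request. The helper below is a hypothetical illustration of how the two paths differ, not plugin code.

```java
/** Hypothetical helper showing the two pod-list paths seen in the logs of this thread. */
class PodListPath {
    /** Returns the pod-list path; a null or empty namespace yields the cluster-wide path. */
    static String podsPath(String namespace) {
        if (namespace == null || namespace.isEmpty()) {
            // Cluster-wide listing: needs permission to list pods in every namespace.
            return "/api/v1/pods";
        }
        // Namespaced listing: needs permission only within that namespace.
        return "/api/v1/namespaces/" + namespace + "/pods";
    }
}
```

For example, `PodListPath.podsPath("jenkins")` gives the path from the original report, and `PodListPath.podsPath(null)` gives the one from this comment.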

          Carlos Sanchez added a comment - - edited

          Fixed in https://github.com/jenkinsci/kubernetes-plugin/pull/189 by removing the caching of the Kubernetes client.

            csanchez Carlos Sanchez
            m4x1m0v3r Albert V