
Failed to count the # of live instances on Kubernetes because of expired bearer token

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: kubernetes-plugin
    • None
    • Jenkins ver. 2.150.2
      kubernetes-plugin 1.14.3
      OpenShift Master:
          v3.9.30
      Kubernetes Master:
          v1.9.1+a0ce1bc657
    • 1.14.5

      After upgrading from 1.12.7 to 1.14.3, everything worked fine, but after 24 hours no new slave pods were created and the following exception appeared in the log:

      Feb 11, 2019 3:53:01 PM okhttp3.internal.platform.Platform log
      INFO: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}

      Feb 11, 2019 3:53:01 PM okhttp3.internal.platform.Platform log
      INFO: <-- END HTTP (129-byte body)
      Feb 11, 2019 3:53:01 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
      WARNING: Failed to count the # of live instances on Kubernetes
      io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://b22.jonqe.lab.eng.bos.redhat.com:8443/api/v1/namespaces/jenkins-slaves/pods?labelSelector=jenkins%3Dslave. Message: Unauthorized. Received status: Status(apiVersion=v1, code=401, details=null, kind=Status, message=Unauthorized, metadata=ListMeta(_continue=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Unauthorized, status=Failure, additionalProperties={}).
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:478)
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:417)
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:381)
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:344)
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:328)
      at io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:193)
      at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:618)
      at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:68)
      at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.addProvisionedSlave(KubernetesCloud.java:505)
      at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.provision(KubernetesCloud.java:458)
      at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715)
      at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320)
      at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
      at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:809)
      at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
      at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)

       

       

      After the Jenkins master is restarted, everything works again, but 24 hours later it fails in the same way. The following appears in the OpenShift log:

      Feb 14 07:23:45 b22 atomic-openshift-master-api: E0214 07:23:45.537998 2177 authentication.go:64] Unable to authenticate the request due to an error: [invalid bearer token, [invalid bearer token, oauthaccesstokens.oauth.openshift.io "7BM2kmQ6wu8GZx9vOFVupO8W-a5Wc9Unf2ltJtogg2c" not found]]

       

      It seems that the new version of the plugin uses a single client with a single token, which by default expires after 24 hours (OpenShift config: accessTokenMaxAgeSeconds: 86400). When Jenkins is restarted, a new client with a new token is created; it works for another 24 hours and then the token expires again.
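      To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It does not use the plugin or the fabric8 client; FakeApiServer, LongLivedClient and the injected clock are hypothetical stand-ins, and the only value taken from this report is the 86400-second token lifetime.

```java
import java.util.function.LongSupplier;

/** Hypothetical simulation: a client that keeps one bearer token for its whole lifetime. */
class TokenExpiryDemo {

    /** Stand-in for the API server: a token is accepted for maxAgeSeconds after it was issued. */
    static final class FakeApiServer {
        private final long maxAgeSeconds;
        private final LongSupplier clock; // current time in seconds (injected so it can be faked)

        FakeApiServer(long maxAgeSeconds, LongSupplier clock) {
            this.maxAgeSeconds = maxAgeSeconds;
            this.clock = clock;
        }

        /** A token is represented simply by its issue time. */
        long issueToken() {
            return clock.getAsLong();
        }

        /** Returns an HTTP-like status: 200 while the token is fresh, 401 once it has expired. */
        int listPods(long token) {
            return (clock.getAsLong() - token < maxAgeSeconds) ? 200 : 401;
        }
    }

    /** Stand-in for a cached client that obtains a token once and never refreshes it. */
    static final class LongLivedClient {
        private final FakeApiServer server;
        private final long token;

        LongLivedClient(FakeApiServer server) {
            this.server = server;
            this.token = server.issueToken();
        }

        int countLiveInstances() {
            return server.listPods(token);
        }
    }

    public static void main(String[] args) {
        long[] now = {0};
        FakeApiServer server = new FakeApiServer(86_400, () -> now[0]);

        LongLivedClient client = new LongLivedClient(server);
        System.out.println(client.countLiveInstances()); // fresh token

        now[0] = 86_400; // 24 hours later, same client, same token
        System.out.println(client.countLiveInstances()); // token has expired

        client = new LongLivedClient(server); // a "restart" creates a new client with a new token
        System.out.println(client.countLiveInstances());
    }
}
```

      In this simulation the same client object succeeds at first, fails once 24 hours have passed, and only recovers when it is recreated, which matches the restart behaviour described above.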

       

      It is possible to increase accessTokenMaxAgeSeconds on the OpenShift side, but that is not a good workaround: Jenkins still has to be restarted whenever the token expires.

          [JENKINS-56140] Failed to count the # of live instances on Kubernetes because of expired bearer token

          Carlos Sanchez added a comment - see https://github.com/jenkinsci/kubernetes-plugin/pull/429

          Filip Brychta added a comment -

          Thanks a lot for the quick response. IIUC, PR #429 forces cached clients to be flushed from the cache after 24 hours by default, so newly created clients should get a new token. This looks like a good solution to me.

          Thank you

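          For readers who want to see the idea of time-based flushing in isolation, the following is a rough sketch only, not the code from PR #429. ExpiringClientCache, its factory parameter and the injected clock are hypothetical names; the sketch assumes that building a new client also obtains a new token.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical cache that rebuilds its client once the cached one is older than maxAgeSeconds. */
class ExpiringClientCache<C> {
    private final Supplier<C> factory;  // assumed to create a client with a fresh token
    private final long maxAgeSeconds;
    private final LongSupplier clock;   // current time in seconds (injected so it can be faked)
    private C client;
    private long createdAt;

    ExpiringClientCache(Supplier<C> factory, long maxAgeSeconds, LongSupplier clock) {
        this.factory = factory;
        this.maxAgeSeconds = maxAgeSeconds;
        this.clock = clock;
    }

    /** Returns the cached client, replacing it first if it has reached the maximum age. */
    synchronized C get() {
        long now = clock.getAsLong();
        if (client == null || now - createdAt >= maxAgeSeconds) {
            client = factory.get();
            createdAt = now;
        }
        return client;
    }

    public static void main(String[] args) {
        long[] now = {0};
        int[] logins = {0};
        ExpiringClientCache<String> cache = new ExpiringClientCache<>(
                () -> "client-" + (++logins[0]), 86_400, () -> now[0]);

        System.out.println(cache.get()); // first call builds a client
        now[0] = 3_600;
        System.out.println(cache.get()); // one hour later: same client is reused
        now[0] = 86_400;
        System.out.println(cache.get()); // 24 hours later: a new client is built
    }
}
```

          One design point worth checking in a real setup: if the cache lifetime equals the token lifetime exactly, a request could still be sent just as the token expires, so a cache lifetime somewhat shorter than accessTokenMaxAgeSeconds may be safer. That is an assumption on my side, not something confirmed in this issue.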

            Assignee: Carlos Sanchez (csanchez)
            Reporter: Filip Brychta (fbrychta)