k8s plugin sometimes swaps the pod name with the node name


      We are seeing about 10-60 crashed jobs per day in which the client appears to mix up the node and pod names in its invocation, which leads to 404s from the API server.

      It begins with us launching a job, which sends out a pod template, and everything starts up. Sometimes the job runs for a while before a shell call starts to fail; sometimes it fails right away.

      The plugin attempts to send a shell command to the pod, but because it uses the wrong name (the node name instead of the pod name), the call fails.

      [2021-11-30T18:45:23.979Z] [Pipeline] // node
      [2021-11-30T18:45:23.987Z] [Pipeline] }
      [2021-11-30T18:45:23.991Z] 
      [2021-11-30T18:45:24.003Z] [Pipeline] // ansiColor
      [2021-11-30T18:45:24.012Z] [Pipeline] }
      [2021-11-30T18:45:24.028Z] [Pipeline] // stage
      [2021-11-30T18:45:24.040Z] [Pipeline] }
      [2021-11-30T18:45:24.054Z] [Pipeline] // withEnv
      [2021-11-30T18:45:24.072Z] [Pipeline] echo
      
      [2021-11-30T18:45:24.074Z] Also:   java.lang.Throwable: waiting here
      [2021-11-30T18:45:24.074Z] 	at io.fabric8.kubernetes.client.utils.Utils.waitUntilReady(Utils.java:151)
      [2021-11-30T18:45:24.074Z] 	at io.fabric8.kubernetes.client.dsl.internal.ExecWebSocketListener.waitUntilReady(ExecWebSocketListener.java:188)
      [2021-11-30T18:45:24.074Z] 	at io.fabric8.kubernetes.client.dsl.internal.core.v1.PodOperationsImpl.exec(PodOperationsImpl.java:331)
      [2021-11-30T18:45:24.074Z] 	at io.fabric8.kubernetes.client.dsl.internal.core.v1.PodOperationsImpl.exec(PodOperationsImpl.java:86)
      [2021-11-30T18:45:24.074Z] 	at org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1.doLaunch(ContainerExecDecorator.java:421)
      [2021-11-30T18:45:24.074Z] 	at org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1.launch(ContainerExecDecorator.java:338)
      [2021-11-30T18:45:24.074Z] 	at hudson.Launcher$ProcStarter.start(Launcher.java:508)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:176)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:136)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:320)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:319)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:193)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
      [2021-11-30T18:45:24.074Z] 	at groovy.lang.MetaClassImpl.invokeMethodOnGroovyObject(MetaClassImpl.java:1278)
      [2021-11-30T18:45:24.074Z] 	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1172)
      [2021-11-30T18:45:24.074Z] 	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
      [2021-11-30T18:45:24.074Z] 	at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
      [2021-11-30T18:45:24.074Z] 	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
      [2021-11-30T18:45:24.074Z] 	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:86)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
      [2021-11-30T18:45:24.074Z] 	at jdk.internal.reflect.GeneratedMethodAccessor143.invoke(Unknown Source)
      [2021-11-30T18:45:24.074Z] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.dispatch(CollectionLiteralBlock.java:55)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.item(CollectionLiteralBlock.java:45)
      [2021-11-30T18:45:24.074Z] 	at jdk.internal.reflect.GeneratedMethodAccessor193.invoke(Unknown Source)
      [2021-11-30T18:45:24.074Z] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.Next.step(Next.java:83)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
      [2021-11-30T18:45:24.074Z] 	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
      [2021-11-30T18:45:24.074Z] 	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
      [2021-11-30T18:45:24.074Z] 	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:185)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:400)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:96)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:312)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:276)
      [2021-11-30T18:45:24.074Z] 	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      [2021-11-30T18:45:24.074Z] 	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
      [2021-11-30T18:45:24.074Z] 	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      [2021-11-30T18:45:24.074Z] 	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.lang.Thread.run(Thread.java:834)
      
      [2021-11-30T18:45:24.074Z] io.fabric8.kubernetes.client.KubernetesClientException: pods "ip-*-*-*-*.us-west-2.compute.internal" not found
      [2021-11-30T18:45:24.074Z] 	at io.fabric8.kubernetes.client.dsl.internal.ExecWebSocketListener.onFailure(ExecWebSocketListener.java:239)
      [2021-11-30T18:45:24.074Z] 	at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
      [2021-11-30T18:45:24.074Z] 	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198)
      [2021-11-30T18:45:24.074Z] 	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
      [2021-11-30T18:45:24.074Z] 	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      [2021-11-30T18:45:24.074Z] 	at java.base/java.lang.Thread.run(Thread.java:834)
      

       

      The crux of it is this line:

      io.fabric8.kubernetes.client.KubernetesClientException: pods "ip-*-*-*-*.us-west-2.compute.internal" not found 

       

      I've censored the actual IP address, but this is a node name, not a pod name, which suggests that something in the plugin is mixing the two up.
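      To count how many failed builds show this symptom, the console logs can be scanned for the "not found" message and the quoted name checked against the node hostname shape. The following is only a rough triage sketch, not part of the plugin: the class and method names are hypothetical, and the pattern assumes EC2-style private DNS names of the form seen in the log above (it also accepts "*" in place of each octet so that censored logs still match).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage helper; not part of the kubernetes plugin or fabric8.
final class NodeNameCheck {
    // Captures the quoted name from a 'pods "<name>" not found' message.
    private static final Pattern POD_NOT_FOUND =
            Pattern.compile("pods \"([^\"]+)\" not found");

    // Assumed node hostname shape, e.g. ip-10-0-1-23.us-west-2.compute.internal.
    // Each octet may also be "*" so that censored log lines still match.
    private static final Pattern EC2_NODE_NAME =
            Pattern.compile("ip(-(\\d{1,3}|\\*)){4}\\.[a-z0-9-]+\\.compute\\.internal");

    // Returns the name the API server reported as missing, if the line has one.
    static Optional<String> missingPodName(String logLine) {
        Matcher m = POD_NOT_FOUND.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True when the name has the node hostname shape rather than a pod name.
    static boolean looksLikeNodeName(String name) {
        return EC2_NODE_NAME.matcher(name).matches();
    }

    // True when a log line reports a missing "pod" whose name is really a node name.
    static boolean isSwappedNameFailure(String logLine) {
        return missingPodName(logLine)
                .map(NodeNameCheck::looksLikeNodeName)
                .orElse(false);
    }

    public static void main(String[] args) {
        String line = "io.fabric8.kubernetes.client.KubernetesClientException: "
                + "pods \"ip-*-*-*-*.us-west-2.compute.internal\" not found";
        System.out.println(isSwappedNameFailure(line));
    }
}
```

      A line that reports a genuinely missing pod (one whose name does not have the node hostname shape) is not flagged, so the count should separate this bug from ordinary pod deletions.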

            Assignee: Kyle Cronin
            Reporter: Anja Berens