JENKINS-42048

Cannot Connect, PID NumberFormatException


      Description

      I tried to narrow this bug down, but there isn't much information to go on. We just upgraded all plugins to the newest versions, but unfortunately we upgraded a lot at once, so I have no idea which one caused it.

      This is spamming our logs every few seconds:

      00:19:42.695 Cannot contact kubernetes-ef39fe82c8a541be84bd780e4d7c1ddb-ce4d47fc96bfc: java.io.IOException: corrupted content in /home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-96fa79b7/pid: java.lang.NumberFormatException: For input string: ""
      00:19:57.758 Cannot contact kubernetes-ef39fe82c8a541be84bd780e4d7c1ddb-ce4d47fc96bfc: java.io.IOException: corrupted content in /home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-96fa79b7/pid: java.lang.NumberFormatException: For input string: ""
      00:20:12.769 Cannot contact kubernetes-ef39fe82c8a541be84bd780e4d7c1ddb-ce4d47fc96bfc: java.io.IOException: corrupted content in /home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-96fa79b7/pid: java.lang.NumberFormatException: For input string: ""
      

      Added information from comments:

      The same kind of problem has occurred in my Jenkins running on GKE. The problem appeared after upgrading the Pipeline Nodes and Processes Plugin from 2.8 to 2.9. I can confirm that the problem is temporarily resolved by downgrading that plugin from 2.9 to 2.8.

      BTW workflow-durable-task-step 2.9 does add this log message, but it is just exposing a problem that was already there and was simply being suppressed unless you were running a sufficiently fine logger. The problem is that this code is seeing a file which is supposed to contain a number once created, whereas it is being created empty for some reason.

      Again, the bug probably exists in all versions; it is only printed to the build log as of 2.9. You can add a FINE logger to org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep to verify.
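
      For example, a FINE logger for that class can be enabled from the Jenkins script console with something along these lines (a sketch using the standard java.util.logging API; the same effect can be achieved by adding a log recorder at FINE under Manage Jenkins > System Log):

      import java.util.logging.Level
      import java.util.logging.Logger

      // Raise the durable-task step logger to FINE so the previously suppressed
      // "corrupted content" messages become visible.
      def dtsLogger = Logger.getLogger('org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep')
      dtsLogger.setLevel(Level.FINE)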

            Activity

            larslawoko Lars Lawoko created issue -
            iocanel Ioannis Canellos added a comment -

            It seems that your pod is created but for some reason the pid can't be read, which is something I've never seen before.

            Can you please tell us what your pipeline looks like and maybe give us the output of `kubectl describe` for the kubernetes-ef39fe82c8a541be84bd780e4d7c1ddb-ce4d47fc96bfc pod?

            jknurek J Knurek added a comment -

            I'm experiencing the same recurring error message.

            The slave is using the golang Docker image, and the pipeline is set up like this:

            podTemplate(label: 'jenkpod', containers: [
                containerTemplate(name: 'golang', image: 'golang:1.8', ttyEnabled: true, command: 'cat')
              ]) {
              node ('jenkpod') { container('golang') {
                stage('Pre-Build') {
                    checkout scm
                    sh 'make get'
                }
              } }
            }
            

            The events for the slave pod:

            Events:
              FirstSeen	LastSeen	Count	From								SubObjectPath		Type		Reason		Message
              ---------	--------	-----	----								-------------		--------	------		-------
              13m		13m		1	{default-scheduler }									Normal		Scheduled	Successfully assigned kubernetes-f1e4a27973a941c2af08bebbc74cc080-10bf4e8527c4 to gke-jenkins
              13m		13m		1	{kubelet gke-jenkins}	spec.containers{golang}	Normal		Pulled		Container image "golang:1.8" already present on machine
              13m		13m		1	{kubelet gke-jenkins}	spec.containers{golang}	Normal		Created		Created container with docker id 97e4b71e323e; Security:[seccomp=unconfined]
              13m		13m		1	{kubelet gke-jenkins}	spec.containers{golang}	Normal		Started		Started container with docker id 97e4b71e323e
              13m		13m		1	{kubelet gke-jenkins}	spec.containers{jnlp}	Normal		Pulled		Container image "jenkinsci/jnlp-slave:alpine" already present on machine
              13m		13m		1	{kubelet gke-jenkins}	spec.containers{jnlp}	Normal		Created		Created container with docker id 628623d03379; Security:[seccomp=unconfined]
              13m		13m		1	{kubelet gke-jenkins}	spec.containers{jnlp}	Normal		Started		Started container with docker id 628623d03379
            
            larslawoko Lars Lawoko added a comment - - edited

            For more context, we are running on Google Container Engine (hosted Kubernetes). The weird thing is that it seems to be working, i.e. the pipeline builds even with the constant exception.

            The exception starts once the Gradle shell step is run:

            [Pipeline] }
            [Pipeline] // stage
            [Pipeline] stage
            [Pipeline] { (Build & run unit tests)
            [Pipeline] withEnv
            [Pipeline] {
            [Pipeline] sh
            00:01:05.567 [Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A] Running shell script
            00:01:05.572 Executing shell script inside container [gcloud-jdk7] of pod [kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9]
            00:01:05.653 Executing command: sh -c echo $$ > '/home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/pid'; jsc=durable-ca85172bfb8670e4c44f30557e14af18; JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/script.sh' > '/home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/jenkins-log.txt' 2>&1; echo $? > '/home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/jenkins-result.txt' 
            00:01:05.694 # cd /home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A
            00:01:05.694 sh -c echo $$ > '/home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/pid'; jsc=durable-ca85172bfb8670e4c44f30557e14af18; JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/script.sh' > '/home/jenkins/workspace/Robusta_robusta_develop-6E# PNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/jenkins-log.txt' 2>&1; echo $? > '/home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/jenkins-result.txt' 
            00:01:05.694 exit
            00:01:05.909 + ./gradlew --stacktrace --parallel buildUnitTest
            00:01:05.909 Downloading https://services.gradle.org/distributions/gradle-3.3-all.zip
            00:01:05.995 Cannot contact kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9: java.io.IOException: corrupted content in /home/jenkins/workspace/Robusta_robusta_develop-6EPNQBJK5BYEXOJV6L45MMZZGUIP7WO4Y6EGRUYNFFMRC7B2GL3A@tmp/durable-aa4bf913/pid: java.lang.NumberFormatException: For input string: ""
            

            Our Jenkinsfile is pretty massive, but here is the core (it has been edited):

                podTemplate(label: 'JavaPod', containers: [
                    containerTemplate(
                        name: 'gcloud-jdk7',
                        image: 'gcr.io/pc-infrastructure/robusta-jenkins-gcloud-jdk7',
                        ttyEnabled: true,
                        args: 'cat',
                        command: '/bin/sh -c',
                        alwaysPullImage: true,
                        workingDir: '/home/jenkins',
                        resourceRequestCpu: '2',
                        resourceRequestMemory: '8Gi',
                        resourceLimitCpu: '5',
                        resourceLimitMemory: '9Gi',
                    ),
                    containerTemplate(
                        name: 'jnlp',
                        image: 'jenkinsci/jnlp-slave:alpine',
                        args: '${computer.jnlpmac} ${computer.name}',
                        resourceRequestCpu: '100m',
                        resourceRequestMemory: '500Mi',
                        resourceLimitCpu: '500m',
                        resourceLimitMemory: '1Gi',
                    )
                ]) {
             node('JavaPod') {
                            container('gcloud-jdk7') {
                                timeout(30) { //assume something is wrong if it takes an half an hour
                                    stage('checkout source') {
                                        checkout scm
                                   }
                                   switch (env.BRANCH_NAME) {
                                        case 'develop':
                                            buildUnitTest()
                                            runIntegrationTests('local')
                                   }
                                }
                            }
                        }
            }
            
            void buildUnitTest() {
                stage('Build & run unit tests') {
                    withEnv(runEnv) {
                        try {
                            def command = './gradlew --stacktrace --parallel buildUnitTest'
                            if (env.BRANCH_NAME == 'master'){
                                command =  'export ROBUSTA_PROD_ANALYTICS=true && ' + command
                            }
                            sh command
                        } catch (Exception e) {
                            junit allowEmptyResults: true, testResults: '**/build/test-results/**/*.xml'
                            step([$class: 'CheckStylePublisher', canComputeNew: false, defaultEncoding: '', healthy: '', pattern: '**/main.xml,**/test.xml', unHealthy: ''])
                            throw e
                        }
                        junit allowEmptyResults: true, testResults: '**/build/test-results/**/*.xml'
                        step([$class: 'CheckStylePublisher', canComputeNew: false, defaultEncoding: '', healthy: '', pattern: '**/main.xml,**/test.xml', unHealthy: ''])
                    }
                }
            }
            
            void runIntegrationTests(String targetEnv) {
                stage('Run integration tests') {
                    withEnv(runEnv) {
                        try {
                            sh "./gradlew --stacktrace :robusta-integration-tests:integrationTest -PtestEnv=${targetEnv}"
                        } catch (Exception e) {
                            junit allowEmptyResults: true, testResults: '**/build/test-results/**/*.xml'
                            throw e
                        }
                        junit allowEmptyResults: true, testResults: '**/build/test-results/**/*.xml'
                    }
                }
            }
            
            
            kubectl describe pod kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9
            Name:		kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9
            Namespace:	default
            Node:		gke-pci-default-pool-44b03267-0ztl/10.140.0.2
            Start Time:	Thu, 16 Feb 2017 09:38:06 +1100
            Labels:		jenkins=slave
            		jenkins/JavaPod=true
            Status:		Running
            IP:		10.40.20.94
            Controllers:	<none>
            Containers:
              gcloud-jdk7:
                Container ID:	docker://28c3b60ae04d952cce366d5a03c2d950a171594828fd19446fe2aa9ed379dd33
                Image:		gcr.io/pc-infrastructure/robusta-jenkins-gcloud-jdk7
                Image ID:		docker://sha256:16e905ffe4f3393f6ee4b5125971a3029d6162ca0d3db5b2973f1f13b6201c3f
                Port:		
                Command:
                  /bin/sh
                  -c
                Args:
                  cat
                Limits:
                  cpu:	5
                  memory:	9Gi
                Requests:
                  cpu:		2
                  memory:		8Gi
                State:		Running
                  Started:		Thu, 16 Feb 2017 09:38:07 +1100
                Ready:		True
                Restart Count:	0
                Volume Mounts:
                  /home/jenkins from workspace-volume (rw)
                  /var/run/secrets/kubernetes.io/serviceaccount from default-token-y9hsd (ro)
                Environment Variables:
                  JENKINS_SECRET:		16b02724728739b72e0b559940adc7b5da29e9e190e8a35e858cece4bbc92346
                  JENKINS_NAME:		kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9
                  JENKINS_LOCATION_URL:	https://build-robusta.papercut.software/
                  JENKINS_URL:		http://build-robusta
                  JENKINS_JNLP_URL:		http://build-robusta/computer/kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9/slave-agent.jnlp
                  HOME:			/home/jenkins
              jnlp:
                Container ID:	docker://b27bb762a03525763aee7d2a60a85b4c3331aa91c6ac7d40b40693f570c1b564
                Image:		jenkinsci/jnlp-slave:alpine
                Image ID:		docker://sha256:254fd665eaf0229f38295a9eac6c7f9bf32a2f450ecbcc8212f3e53b96dd339d
                Port:		
                Args:
                  16b02724728739b72e0b559940adc7b5da29e9e190e8a35e858cece4bbc92346
                  kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9
                Limits:
                  cpu:	500m
                  memory:	1Gi
                Requests:
                  cpu:		100m
                  memory:		500Mi
                State:		Running
                  Started:		Thu, 16 Feb 2017 09:38:06 +1100
                Ready:		True
                Restart Count:	0
                Volume Mounts:
                  /home/jenkins from workspace-volume (rw)
                  /var/run/secrets/kubernetes.io/serviceaccount from default-token-y9hsd (ro)
                Environment Variables:
                  JENKINS_SECRET:		<secret>
                  JENKINS_NAME:		kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9
                  JENKINS_LOCATION_URL:	<jenkins url> (I replaced)
                  JENKINS_URL:		<jenkins url> (I replaced)
                  JENKINS_JNLP_URL:		<jenkins url> (I replaced)/computer/kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9/slave-agent.jnlp
                  HOME:			/home/jenkins
            Conditions:
              Type		Status
              Initialized 	True 
              Ready 	True 
              PodScheduled 	True 
            Volumes:
              workspace-volume:
                Type:	EmptyDir (a temporary directory that shares a pod's lifetime)
                Medium:	
              default-token-y9hsd:
                Type:	Secret (a volume populated by a Secret)
                SecretName:	default-token-y9hsd
            QoS Class:	Burstable
            Tolerations:	<none>
            Events:
              FirstSeen	LastSeen	Count	From						SubObjectPath			Type		Reason		Message
              ---------	--------	-----	----						-------------			--------	------		-------
              1m		1m		1	{default-scheduler }								Normal		Scheduled	Successfully assigned kubernetes-9f544f8d984342c8bfa152fd3134608b-d1fdf7ba230b9 to gke-pci-default-pool-44b03267-0ztl
              1m		1m		1	{kubelet gke-pci-default-pool-44b03267-0ztl}	spec.containers{jnlp}		Normal		Pulled		Container image "jenkinsci/jnlp-slave:alpine" already present on machine
              1m		1m		1	{kubelet gke-pci-default-pool-44b03267-0ztl}	spec.containers{jnlp}		Normal		Created		Created container with docker id b27bb762a035; Security:[seccomp=unconfined]
              1m		1m		1	{kubelet gke-pci-default-pool-44b03267-0ztl}	spec.containers{jnlp}		Normal		Started		Started container with docker id b27bb762a035
              1m		1m		1	{kubelet gke-pci-default-pool-44b03267-0ztl}	spec.containers{gcloud-jdk7}	Normal		Pulling		pulling image "gcr.io/pc-infrastructure/robusta-jenkins-gcloud-jdk7"
              1m		1m		1	{kubelet gke-pci-default-pool-44b03267-0ztl}	spec.containers{gcloud-jdk7}	Normal		Pulled		Successfully pulled image "gcr.io/pc-infrastructure/robusta-jenkins-gcloud-jdk7"
              1m		1m		1	{kubelet gke-pci-default-pool-44b03267-0ztl}	spec.containers{gcloud-jdk7}	Normal		Created		Created container with docker id 28c3b60ae04d; Security:[seccomp=unconfined]
              1m		1m		1	{kubelet gke-pci-default-pool-44b03267-0ztl}	spec.containers{gcloud-jdk7}	Normal		Started		Started container with docker id 28c3b60ae04d
            
            daichirata Daichi Hirata added a comment -

            The same kind of problem has occurred in my Jenkins running on GKE. The problem appeared after upgrading the Pipeline Nodes and Processes Plugin from 2.8 to 2.9. I can confirm that the problem is temporarily resolved by downgrading that plugin from 2.9 to 2.8.

            larslawoko Lars Lawoko made changes -
            Component/s: (none) -> workflow-durable-task-step-plugin [ 21715 ]
            larslawoko Lars Lawoko added a comment - - edited

            Daichi Hirata Thanks, I just tried this and can confirm that Pipeline Nodes and Processes Plugin 2.9 is the issue.

            Ioannis Canellos He narrowed it down to this plugin/version.

            iocanel Ioannis Canellos added a comment -

            Awesome, that's really helpful!

            Unfortunately, I can only help with the kubernetes-plugin. Can you reassign the issue to someone involved with the `Pipeline Nodes and Processes Plugin`?

            larslawoko Lars Lawoko added a comment - - edited

            Ioannis Canellos Will do, I believe Jesse Glick is part of that plugin team?

            Also, if you are on the Kubernetes team and have time, could you have a quick look at JENKINS-40647? I can confirm that it is still happening with the plugins updated to the latest versions.

            larslawoko Lars Lawoko made changes -
            Assignee: Ioannis Canellos [ iocanel ] -> Jesse Glick [ jglick ]
            jglick Jesse Glick added a comment -

            Unless there is some way to reproduce without Kubernetes, or someone using Kubernetes is willing to debug why an empty PID file is being written, I have nothing to go on.

            jglick Jesse Glick made changes -
            Assignee: Jesse Glick [ jglick ] -> (unassigned)
            juhtie01 Juha Tiensyrjä added a comment -

            I could probably find some time for debugging this week. Just let me know what to do.

            iocanel Ioannis Canellos added a comment -

            Can you set the `workingDir` on the jnlp container and tell me if that makes any difference?
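
            For reference, a sketch of what that could look like, reusing the jnlp containerTemplate from the Jenkinsfile above (the '/var/tmp' value is only an example):

                containerTemplate(
                    name: 'jnlp',
                    image: 'jenkinsci/jnlp-slave:alpine',
                    args: '${computer.jnlpmac} ${computer.name}',
                    workingDir: '/var/tmp'   // explicit workingDir on the jnlp container
                )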

            juhtie01 Juha Tiensyrjä added a comment -

            I tried that; the problem seems to be the same:

            [Pipeline] node
            Running on test-55f1b2f811564473b115b8af4962a8ad-1b98a80d910eec in /var/tmp/workspace/Playground/test-JENKINS-42048
            [Pipeline] {
            [Pipeline] sh
            [test-JENKINS-42048] Running shell script
            + echo Foo
            Foo
            + sleep 10
            [Pipeline] container
            [Pipeline] {
            [Pipeline] sh
            [test-JENKINS-42048] Running shell script
            Executing shell script inside container [jnlp] of pod [test-55f1b2f811564473b115b8af4962a8ad-1b98a80d910eec]
            Executing command: sh -c echo $$ > '/var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/pid'; jsc=durable-51e972f92a0e472b7953c41703e464ca; JENKINS_SERVER_COOKIE=$jsc '/var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/script.sh' > '/var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/jenkins-log.txt' 2>&1; echo $? > '/var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/jenkins-result.txt' 
            $ cd "/var/tmp/workspace/Playground/test-JENKINS-42048"
            sh -c echo $$ > '/var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/pid'; jsc=durable-51e972f92a0e472b7953c41703e464ca; JENKINS_SERVER_COOKIE=$jsc '/var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/script.sh' > '/var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/jenkins-log.txt' 2>&1; echo $? > '/var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/jenkins-result.txt' 
            exit
            $ + echo Bar
            Bar
            + sleep 10
            Cannot contact test-55f1b2f811564473b115b8af4962a8ad-1b98a80d910eec: java.io.IOException: corrupted content in /var/tmp/workspace/Playground/test-JENKINS-42048@tmp/durable-b2162d32/pid: java.lang.NumberFormatException: For input string: ""
            ...
            

            The container and the directory are the same; the only difference is that first I call it directly from within a node block, then I call it from within a container block.
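
            In other words, a minimal reproduction looks roughly like this (a sketch reconstructed from the log above; the pod label, image, and workingDir are assumptions):

                podTemplate(label: 'test', containers: [
                    containerTemplate(name: 'jnlp', image: 'jenkinsci/jnlp-slave:alpine',
                        args: '${computer.jnlpmac} ${computer.name}', workingDir: '/var/tmp')
                ]) {
                  node('test') {
                    // Runs directly on the agent: no error.
                    sh 'echo Foo; sleep 10'
                    container('jnlp') {
                      // Same container and directory, but routed through the container step:
                      // the build log fills with the NumberFormatException message.
                      sh 'echo Bar; sleep 10'
                    }
                  }
                }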

            killdash9 Michael Andrews added a comment - - edited

            We just upgraded the Kubernetes Plugin to 0.11 and are now seeing this as well. It seems to happen on our shell steps (but not all of them).

            Here is the pipeline code:

                               query = '\'Reservations[].Instances[].ImageId\''
                                imagesInUse =
                                    sh(
                                        returnStdout: true,
                                        script: """\
                                            aws ec2 describe-instances \
                                                --region ${region} \
                                                --query ${query} \
                                                --output text
                                        """.stripIndent()
                                    ).trim().split().toUnique()
            
                                echo "Images in-use:\n${imagesInUse}"
            

            ...and here is the logging:

            // Some comments here
            + aws ec2 describe-instances --region us-east-1 --query Reservations[].Instances[].ImageId --output text
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            Cannot contact kubernetes-d132fbe56cdf44b589aa03203db4ae55-f2bde129ba: java.io.IOException: corrupted content in /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis@tmp/durable-fe439038/pid: java.lang.NumberFormatException: For input string: ""
            /home/jenkins/workspace/ops-maintenance/ops-maintenance-scheduled/clean-up-amis/cleanup-baseline-amis # exit
            [Pipeline] echo
            Images in-use:
            [ami-9b0df48d, ami-e8d11dfe, ami-3d3eff2b, ami-c4e81ad2, ami-33576959, ami-cd8b5fdb, ami-ebd205fd, ami-0e367a19, ami-56ed0740, ami-2e3c4c39, ami-4b71095c]
            

            The scripts seem to be OK, and the job does not fail.

            jglick Jesse Glick added a comment -

            BTW workflow-durable-task-step 2.9 does add this log message, but it is just exposing a problem that was already there and was simply being suppressed unless you were running a sufficiently fine logger. The problem is that this code is seeing a file which is supposed to contain a number once created, whereas it is being created empty for some reason.

            juhtie01 Juha Tiensyrjä added a comment -

            I tried to do an 'echo $$' in different ways within a container running in Kubernetes:

            $ sh -c "echo $$"
            4118
            $ sh -c echo $$
            
            $ sh -c 'echo $$'
            4187
            

            So it looks like the shell command which is responsible for creating the PID file doesn't work correctly: without the quotes, only `echo` reaches the -c argument (the calling shell expands $$ first and the expanded PID merely becomes $0 of the child shell), so the pid file ends up empty. Please correct me if I'm wrong, but it looks like this happens in https://github.com/jenkinsci/durable-task-plugin/blob/master/src/main/java/org/jenkinsci/plugins/durabletask/BourneShellScript.java#L119 ?

            jglick Jesse Glick added a comment -

            Yes, that is where the script that stores its own PID is created.

            juhtie01 Juha Tiensyrjä added a comment -

            Great, can it be fixed so that the `cmd` is enclosed in quotes? That should make this problem disappear, as then the `echo $$` command should work just fine.

            jglick Jesse Glick added a comment -

            It should not be enclosed in quotes. It is a single argument to sh -c.

            Maybe the Kubernetes plugin has a buggy Launcher?
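
            (To make the single-argument point concrete: as long as the whole script travels as one argv element all the way to the final sh, $$ is expanded by that sh and the pid file gets real content; if an intermediate layer flattens the argv back into a string and re-splits it, the script degenerates as shown above and nothing is written at all. A rough sketch with an illustrative file path:)

            {code}
            # Script kept as one argv element: the inner sh writes its own PID
            rm -f /tmp/demo-pid
            sh -c 'echo $$ > /tmp/demo-pid'
            cat /tmp/demo-pid            # -> a PID

            # Script no longer a single word: the -c script is just `echo`, the rest
            # becomes positional parameters, no redirection happens, no file is written
            rm -f /tmp/demo-pid
            sh -c echo '$$ > /tmp/demo-pid'
            cat /tmp/demo-pid            # -> "No such file or directory"
            {code}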

            juhtie01 Juha Tiensyrjä added a comment -
                private static String[] getCommands(Launcher.ProcStarter starter) {
                    List<String> allCommands = new ArrayList<String>();
            
            
                    boolean first = true;
                    String previous = "";
                    String previousPrevious = "";
                    for (String cmd : starter.cmds()) {
                        if (first && "nohup".equals(cmd)) {
                            first = false;
                            continue;
                        }
                        if ("sh".equals(previousPrevious) && "-c".equals(previous)) {
                            cmd = String.format("\"%s\"", cmd);
                        }
                        previousPrevious = previous;
                        previous = cmd;
                    // I shouldn't be doing this, but clearly the script that is passed to us is wrong?
                        allCommands.add(cmd.replaceAll("\\$\\$", "\\$"));
                    }
                    return allCommands.toArray(new String[allCommands.size()]);
                }
            

            That should take care of adding the quotes around the shell command argument, which kind of works. I start to get this error instead:

            Executing shell script inside container [debian] of pod [test-f669ba016c06421092b43fbd8b23e3d1-f2d539661013]
            Executing command: ps -o pid= 9999 
            # cd "/home/jenkins/workspace/Playground/test-JENKINS-42048"
            ps -o pid= 9999 
            exit
            # # command terminated with non-zero exit code: Error executing in Docker Container: 1Executing shell script inside container [debian] of pod [test-f669ba016c06421092b43fbd8b23e3d1-f2d539661013]
            Executing command: ps -o pid= 6 
            [Pipeline] }
            [Pipeline] // container
            [Pipeline] }
            [Pipeline] // node
            [Pipeline] }
            [Pipeline] // stage
            [Pipeline] }
            [Pipeline] // podTemplate
            [Pipeline] End of Pipeline
            ERROR: script returned exit code -1
            Finished: FAILURE
            

            It looks like the Kubernetes plugin launcher can't handle any errors in the shell scripts and instead of returning true/false on the aliveness probe in https://github.com/jenkinsci/durable-task-plugin/blob/e174ce7f11e31da0a29c6d3af8023de48c269654/src/main/java/org/jenkinsci/plugins/durabletask/ProcessLiveness.java#L87 it drops dead. Any good ideas, anyone?
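
            (For what it's worth, as I read the linked ProcessLiveness code, the probe boils down to running `ps -o pid= <pid>` and interpreting its exit status as alive/not-alive. On a typical Linux ps, outside Jenkins entirely, that check behaves as sketched below, so an exec layer that turns any non-zero exit status into a hard failure would indeed break the probe rather than report "not alive".)

            {code}
            # Existing process: ps echoes the PID back and exits 0
            ps -o pid= 1; echo "exit=$?"        # -> "    1" then "exit=0"

            # Non-existent process: empty output, non-zero exit status
            ps -o pid= 99999; echo "exit=$?"    # -> "exit=1"
            {code}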

            jglick Jesse Glick added a comment -

            It looks like the Kubernetes plugin launcher can't handle any errors in the shell scripts

            Maybe my warning was right?

            jredl Jesse Redl added a comment -

            It looks like the comments have already covered this bug, but when I upgraded to 2.9 I saw these same exceptions in our logs. I have since downgraded the plugin to version 2.8 and everything is fine.

            [pylint_test] Cannot contact kubernetes-2035b9ceb44d46db9a42cd8dbc1fa0b7-27af70ee4f39c: java.io.IOException: corrupted content in /home/jenkins/workspace/ogies_AA_sre_datadog-events-YUQOU4YVOIDND57J35UZCATDN3QRXKHXSLMS4JYIUCSNEI6IDZGQ@tmp/durable-05858b77/pid: java.lang.NumberFormatException: For input string: ""
            
            jglick Jesse Glick added a comment -

            Again the bug probably exists in all versions, it is only printed to the build log as of 2.9. You can add a FINE logger to org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep to verify.

            jredl Jesse Redl added a comment -

            Thanks for the clarification, and yes you're correct about the errors being part of 2.8 as per the addition of the above logger.

            Mar 04, 2017 4:52:27 AM FINE org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep
            could not check /home/jenkins/workspace/ogies_AA_sre_datadog-events-YUQOU4YVOIDND57J35UZCATDN3QRXKHXSLMS4JYIUCSNEI6IDZGQ
            java.lang.NumberFormatException: For input string: ""
            	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
            	at java.lang.Integer.parseInt(Integer.java:592)
            	at java.lang.Integer.parseInt(Integer.java:615)
            	at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.pid(BourneShellScript.java:183)
            Caused: java.io.IOException: corrupted content in /home/jenkins/workspace/ogies_AA_sre_datadog-events-YUQOU4YVOIDND57J35UZCATDN3QRXKHXSLMS4JYIUCSNEI6IDZGQ@tmp/durable-35571e92/pid
            	at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.pid(BourneShellScript.java:185)
            	at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.exitStatus(BourneShellScript.java:197)
            	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution$3.call(DurableTaskStep.java:314)
            	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution$3.call(DurableTaskStep.java:306)
            	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution$4.call(DurableTaskStep.java:359)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            	at java.lang.Thread.run(Thread.java:745)
            
            juhtie01 Juha Tiensyrjä made changes -
            Link This issue is duplicated by JENKINS-42316 [ JENKINS-42316 ]
            csanchez Carlos Sanchez made changes -
            Component/s kubernetes-plugin [ 20639 ]
            0x89 Martin Sander added a comment -

            This also occurs for me when NOT running on kubernetes.

            georgestark Sascha Vujevic added a comment -

            Could this log output be avoided by setting a log level in Jenkins?

            The log file is not pretty and it's not easy to filter out the relevant output.

            Thank you for your support.

            jwhitcraft Jon Whitcraft added a comment -

            Sascha Vujevic

            Unfortunately there is no way to do that. I pinged Carlos Sanchez about this a few weeks ago and he added the component label, but I have not heard anything else. It would be nice to get this fixed; it's really annoying when trying to debug issues.

            jwhitcraft Jon Whitcraft made changes -
            Attachment Screenshot 2017-03-28 17.05.18.png [ 36805 ]
            jwhitcraft Jon Whitcraft added a comment -

            Jesse Glick and Carlos Sanchez

            I think I found the issue: the durable-task plugin expects a pid file to be written out, but the kubernetes-plugin streams the command into the container over an OutputStream, and the echo $$ > just puts in an empty line.

            Since everything is running inside of a container on kubernetes, the sh -c echo $$ doesn't actually return the pid.

            Example

            If I run, say, the docker image via docker so that it gives me an sh shell, I can get the pid:

            docker run --rm -it --entrypoint=sh docker
            / # echo $$
            1
            / # exit

            However, if I try the same command but put the -c echo $$ as the container command, it doesn't return anything:

            docker run --rm -it --entrypoint=sh docker -c echo $$
            
            

             

            I also verified this by trying to execute the same command on my jenkins container in minikube and nothing was returned

            
            

            kubectl -n jenkins exec jenkins-3824495712-m94j6 -c jenkins -- sh -c echo $$

            I'm unsure of the best way to fix this, but I hope I've provided some details that will help get it fixed, because it's really annoying.
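
            (For comparison, and reusing the same image and pod names from above: keeping the script as one quoted argument all the way through does produce a PID in both cases.)

            {code}
            # docker: the -c script survives as a single argument, so the sh inside the
            # container expands $$ itself (that sh is PID 1 in the container)
            docker run --rm --entrypoint=sh docker -c 'echo $$'

            # kubectl: quote the script so the local shell does not expand $$ first
            kubectl -n jenkins exec jenkins-3824495712-m94j6 -c jenkins -- sh -c 'echo $$'
            {code}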

            jwhitcraft Jon Whitcraft added a comment - - edited

            For now, as a workaround, I've recompiled workflow-durable-task-plugin with this line commented out. IMO the logger a few lines up should log at ERROR instead of FINE, and not taint the job console like it does.

            I'd be willing to issue a PR to make this change, if Jesse Glick agrees.

            tempor Zuzik Zuzikovitch added a comment -

            Got the same problem. Logs are extremely noisy because of that. Hard to debug a real problem.

            chancez Chance Zibolski added a comment -

            I'm also affected by this, and it's seriously annoying and makes the plugin extremely painful to use when looking at the output.

            jwhitcraft Jon Whitcraft made changes -
            Priority Minor [ 4 ] Major [ 3 ]
            larslawoko Lars Lawoko made changes -
            Description
            larslawoko Lars Lawoko made changes -
            Description
            larslawoko Lars Lawoko made changes -
            Description
            joan Joan Goyeau added a comment -

            Same here

            jglick Jesse Glick added a comment -

            Needs to be fixed in the Kubernetes plugin.

            jglick Jesse Glick made changes -
            Component/s workflow-api-plugin [ 21711 ]
            Component/s workflow-durable-task-step-plugin [ 21715 ]
            iocanel Ioannis Canellos added a comment -

             

            I don't know why, but when this was initially implemented, the ContainerExecDecorator was receiving twice the number of `$` symbols. For example, instead of `echo $$` it was getting `echo $$$$`, and so on.

            So as a workaround, the decorator itself removed the excess `$` symbols.

            Could the problem be related to that?

             

            jwhitcraft Jon Whitcraft added a comment -

            Ioannis Canellos,

            I just tested your theory with this pipeline job and this is what it output in the log

            podTemplate(
                label: 'test',
                containers: [
                    containerTemplate(
                        name: 'nodejs',
                        image: 'node:alpine',
                        ttyEnabled: true,
                        command: 'cat',
                        args: ''
                    )
                ]
            ) {
                stage('test') {
                    node('test') {
                        env.MYTOOL_VERSION = '1.33'
                        container('nodejs') {
                          sh 'printenv'
                        }
                        echo env.MYTOOL_VERSION
                    }
                }
            }
            
             

            Results

            [Pipeline] podTemplate
            [Pipeline] {
            [Pipeline] stage
            [Pipeline] { (test)
            [Pipeline] node
            Running on jenkins-slave-4qzr4-jqq83 in /home/jenkins/workspace/test
            [Pipeline] {
            [Pipeline] container
            [Pipeline] {
            [Pipeline] sh
            [test] Running shell script
            Executing shell script inside container [nodejs] of pod [jenkins-slave-4qzr4-jqq83]
            Executing command: 
            /home/jenkins # cd "/home/jenkins/workspace/test"
            /home/jenkins/workspace/test # 
            /home/jenkins/workspace/test # exit
            [Pipeline] }
            [Pipeline] // container
            [Pipeline] }
            [Pipeline] // node
            [Pipeline] }
            [Pipeline] // stage
            [Pipeline] }
            [Pipeline] // podTemplate
            [Pipeline] End of Pipeline
            ERROR: script returned exit code -2
            killdash9 Michael Andrews added a comment -

            Any update on this? Super annoying message. Floods the console output.

            jredl Jesse Redl added a comment -

            This is not a solution for everyone; however, this log noise seems to only happen when you have a pod template with more than one container running. I was working around another issue (https://issues.jenkins-ci.org/browse/JENKINS-40825) and by only having one container running (named jnlp) the log noise was also gone.

             

            jglick Jesse Glick added a comment -

            workflow-durable-task-step PR 37 will reduce noise without addressing the underlying bug, probably somewhere in the Kubernetes plugin.

            jglick Jesse Glick made changes -
            Remote Link This issue links to "workflow-durable-task-step PR 37 (Web Link)" [ 16209 ]
            csanchez Carlos Sanchez made changes -
            Assignee Carlos Sanchez [ csanchez ]
            csanchez Carlos Sanchez made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            killdash9 Michael Andrews added a comment -

            Jesse Redl Definitely not a solution for us, but good to know. Hopefully this will help Carlos Sanchez or Ioannis Canellos implement a fix. In the meantime, we did pull the workflow-durable-task-step PR 37 and merged it into a local build. It did lessen the number of times the message is logged. Hoping this issue is addressed soon.

            jglick Jesse Glick added a comment -

            we did pull the workflow-durable-task-step PR 37 and merged it into a local build

            Just update to 2.11.

            csanchez Carlos Sanchez added a comment - PR at  https://github.com/jenkinsci/kubernetes-plugin/pull/157  
            csanchez Carlos Sanchez made changes -
            Status In Progress [ 3 ] In Review [ 10005 ]
            csanchez Carlos Sanchez made changes -
            Link This issue blocks JENKINS-44150 [ JENKINS-44150 ]
            csanchez Carlos Sanchez made changes -
            Link This issue is duplicated by JENKINS-44152 [ JENKINS-44152 ]
            csanchez Carlos Sanchez made changes -
            Component/s kubernetes-pipeline-plugin [ 21630 ]
            csanchez Carlos Sanchez made changes -
            Link This issue blocks JENKINS-40825 [ JENKINS-40825 ]
            csanchez Carlos Sanchez made changes -
            Link This issue blocks JENKINS-39550 [ JENKINS-39550 ]
            csanchez Carlos Sanchez made changes -
            Link This issue blocks JENKINS-39664 [ JENKINS-39664 ]
            csanchez Carlos Sanchez made changes -
            Resolution Fixed [ 1 ]
            Status In Review [ 10005 ] Closed [ 6 ]
            0x89 Martin Sander made changes -
            Link This issue is related to JENKINS-46087 [ JENKINS-46087 ]
            jglick Jesse Glick made changes -
            Link This issue relates to JENKINS-61950 [ JENKINS-61950 ]

              People

              Assignee:
              csanchez Carlos Sanchez
              Reporter:
              larslawoko Lars Lawoko
              Votes:
              19
              Watchers:
              36

                Dates

                Created:
                Updated:
                Resolved: