Bug
Resolution: Unresolved
Minor
None
1.36 ec2 plugin
Sometimes when AWS terminates one of our spot instances, the node gets into a state where Jenkins still treats it as a valid, available executor even though the instance is already shutting down and cannot complete any builds. When this occurs, our entire backlog of tests rapidly flushes through that executor, and every one of those builds fails.
Sometimes the executor is completely broken, like so:
00:00:00.002 Started by remote host 140.211.10.27
00:00:00.002 [EnvInject] - Loading node environment variables.
00:00:27.357 FATAL: java.io.IOException: Unexpected termination of the channel
00:00:27.358 hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
00:00:27.359 at hudson.remoting.Request.abort(Request.java:303)
00:00:27.360 at hudson.remoting.Channel.terminate(Channel.java:863)
00:00:27.360 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:92)
00:00:27.360 at ......remote call to Testrunner (sir-tdd89gzm)(Native Method)
00:00:27.361 at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1433)
00:00:27.361 at hudson.remoting.Request.call(Request.java:172)
00:00:27.361 at hudson.remoting.Channel.call(Channel.java:796)
00:00:27.362 at hudson.FilePath.act(FilePath.java:1102)
00:00:27.362 at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:48)
00:00:27.363 at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:80)
00:00:27.363 at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:42)
00:00:27.364 at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:572)
00:00:27.364 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:492)
00:00:27.365 at hudson.model.Run.execute(Run.java:1720)
00:00:27.365 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
00:00:27.365 at hudson.model.ResourceController.execute(ResourceController.java:98)
00:00:27.365 at hudson.model.Executor.run(Executor.java:404)
00:00:27.366 Caused by: java.io.IOException: Unexpected termination of the channel
00:00:27.366 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)
00:00:27.367 Caused by: java.io.EOFException
00:00:27.367 at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2335)
00:00:27.367 at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2804)
00:00:27.368 at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
00:00:27.368 at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
00:00:27.368 at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
00:00:27.369 at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
00:00:27.369 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
00:00:27.370 ERROR: Step ‘Publish Checkstyle analysis results’ failed: no workspace for drupal_patches #8845
00:00:27.371 ERROR: Step ‘Archive the artifacts’ failed: no workspace for drupal_patches #8845
00:00:27.372 Checking console output
00:00:27.373 ERROR: Step ‘Publish JUnit test result report’ failed: no workspace for drupal_patches #8845
00:00:27.396 Finished: FAILURE
Other times the instance appears to be partway through shutting down: the Docker daemon has already been killed, but Jenkins has not yet marked the executor as unavailable:
00:00:00.001 Started by remote host 140.211.10.27
00:00:00.001 [EnvInject] - Loading node environment variables.
00:00:00.007 Building remotely on Testrunner (sir-yjpgbk4n) (testrunner) in workspace /var/lib/drupalci/workspace
00:00:00.018 [workspace] $ /bin/bash /tmp/hudson6328886945950906962.sh
00:00:00.124 Cannot connect to the Docker daemon. Is the docker daemon running on this host?
00:00:00.141 Cannot connect to the Docker daemon. Is the docker daemon running on this host?
00:00:00.151 ++ id
00:00:00.152 uid=1001(testbot) gid=1001(testbot) groups=1001(testbot),27(sudo),999(docker)
00:00:00.153 ++ export COMPOSER_CACHE_DIR=/opt/drupalci/composer-cache
00:00:00.153 ++ COMPOSER_CACHE_DIR=/opt/drupalci/composer-cache
00:00:00.153 ++ echo https://www.drupal.org/pift-ci-job/635848
00:00:00.153 https://www.drupal.org/pift-ci-job/635848
00:00:00.154 ++ curl -w '\n' -s http://169.254.169.254/latest/meta-data/instance-type
00:00:00.159 cc2.8xlarge
00:00:00.159 ++ curl -w '\n' -s http://169.254.169.254/latest/meta-data/ami-id
00:00:00.165 ami-3c42c35c
00:00:00.166 ++ curl -w '\n' -s http://169.254.169.254/latest/meta-data/public-ipv4
00:00:00.172 54.212.244.41
00:00:00.172 ++ env
00:00:00.172 ++ grep DCI
00:00:00.173 DCI_CS_CoderVersion=8.2.8
00:00:00.173 DCI_PHPVersion=php-5.6-apache:production
00:00:00.173 DCI_JobType=simpletest
00:00:00.174 DCI_CoreBranch=8.4.x
00:00:00.174 DCI_Patch=rpc_endpoint_to_reset-2847708-24.patch,.
00:00:00.174 DCI_Debug=FALSE
00:00:00.174 DCI_ES_LintFailsTest=TRUE
00:00:00.174 DCI_Fetch=https://www.drupal.org/files/issues/rpc_endpoint_to_reset-2847708-24.patch,.
00:00:00.175 DCI_Concurrency=31
00:00:00.175 DCI_CoreRepository=git://git.drupal.org/project/drupal.git
00:00:00.175 DCI_DBVersion=mysql-5.5
00:00:00.175 ++ env
00:00:00.175 ++ grep -v DCI
00:00:00.175 BUILD_URL=http://dispatcher-origin.drupalci.aws:8080/job/drupal_patches/8786/
00:00:00.176 SHELL=/bin/bash
00:00:00.176 HUDSON_SERVER_COOKIE=f9f94f9baaa33b04
00:00:00.176 SSH_CLIENT=172.31.42.62 35896 22
00:00:00.176 BUILD_TAG=jenkins-drupal_patches-8786
00:00:00.177 ROOT_BUILD_CAUSE=REMOTECAUSE
00:00:00.177 JOB_URL=http://dispatcher-origin.drupalci.aws:8080/job/drupal_patches/
00:00:00.177 WORKSPACE=/var/lib/drupalci/workspace
00:00:00.177 USER=testbot
00:00:00.177 ROOT_BUILD_CAUSE_REMOTECAUSE=true
00:00:00.178 COMPOSER_CACHE_DIR=/opt/drupalci/composer-cache
00:00:00.178 JENKINS_HOME=/usr/local/jenkins
00:00:00.178 MAIL=/var/mail/testbot
00:00:00.178 PATH=/usr/local/bin:/usr/bin:/bin:/usr/games
00:00:00.178 PWD=/var/lib/drupalci/workspace
00:00:00.178 HUDSON_URL=http://dispatcher-origin.drupalci.aws:8080/
00:00:00.179 LANG=en_US.UTF-8
00:00:00.179 JOB_NAME=drupal_patches
00:00:00.179 BUILD_CAUSE_REMOTECAUSE=true
00:00:00.179 BUILD_DISPLAY_NAME=#8786
00:00:00.179 BUILD_ID=8786
00:00:00.179 BUILD_CAUSE=REMOTECAUSE
00:00:00.179 JENKINS_URL=http://dispatcher-origin.drupalci.aws:8080/
00:00:00.180 Drupal_JobID=https://www.drupal.org:635848
00:00:00.180 JOB_BASE_NAME=drupal_patches
00:00:00.180 SHLVL=3
00:00:00.180 HOME=/home/testbot
00:00:00.180 EXECUTOR_NUMBER=0
00:00:00.180 JENKINS_SERVER_COOKIE=f9f94f9baaa33b04
00:00:00.181 NODE_LABELS=Testrunner (sir-yjpgbk4n) testrunner
00:00:00.181 LOGNAME=testbot
00:00:00.181 SSH_CONNECTION=172.31.42.62 35896 172.31.0.168 22
00:00:00.181 HUDSON_HOME=/usr/local/jenkins
00:00:00.181 NODE_NAME=Testrunner (sir-yjpgbk4n)
00:00:00.181 BUILD_NUMBER=8786
00:00:00.182 Testrunner_Branch=production
00:00:00.182 HUDSON_COOKIE=90c8c0be-3081-4c6c-a8c6-e79cc8e16f5c
00:00:00.182 _=/usr/bin/env
00:00:00.182 ++ cd /opt/drupalci/testrunner
00:00:00.182 ++ git fetch --all --tags
00:00:00.183 Fetching origin
00:00:00.275 ++ git checkout production
00:00:00.278 Already on 'production'
00:00:00.278 Your branch is up-to-date with 'origin/production'.
00:00:00.278 ++ git pull --rebase
00:00:00.377 Current branch production is up to date.
00:00:00.379 ++ docker pull drupalci/php-5.6-apache:production
00:00:00.387 Warning: failed to get default registry endpoint from daemon (Cannot connect to the Docker daemon. Is the docker daemon running on this host?). Using system default: https://index.docker.io/v1/
00:00:00.388 Cannot connect to the Docker daemon. Is the docker daemon running on this host?
00:00:00.394 Build step 'Execute shell' marked build as failure
00:00:00.458 [CHECKSTYLE] Collecting checkstyle analysis files...
00:00:00.501 [CHECKSTYLE] Finding all files that match the pattern jenkins-drupal_patches-8786/artifacts/*/checkstyle.xml
00:00:00.504 [CHECKSTYLE] Computing warning deltas based on reference build #8777
00:00:00.504 Archiving artifacts
00:00:00.507 Checking console output
00:00:00.507 Recording test results
00:00:00.510 ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error?
00:00:00.533 Finished: FAILURE
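For triage, the two failure signatures in the logs above can be told apart from genuine test failures by looking for their characteristic messages. The following is only a minimal sketch, assuming the console output is available as plain text; the function name and the returned labels are hypothetical and are not part of Jenkins or the EC2 plugin. The matched substrings are copied from the two console logs in this report.

```python
from typing import Optional

# Substrings copied verbatim from the two console logs in this report.
CHANNEL_DEAD = "Unexpected termination of the channel"
DOCKER_DEAD = "Cannot connect to the Docker daemon"


def classify_node_failure(console_text: str) -> Optional[str]:
    """Return a hypothetical label for a dying-node signature, or None.

    The labels are illustrative only; they do not correspond to any
    Jenkins or EC2 plugin concept.
    """
    if CHANNEL_DEAD in console_text:
        # First case in this report: the remoting channel to the node is gone.
        return "channel-terminated"
    if DOCKER_DEAD in console_text:
        # Second case: the node still accepts builds, but Docker is already down.
        return "docker-daemon-down"
    return None
```

A build classified this way could, for example, be requeued instead of being reported back as a test failure; whether that is appropriate depends on the setup.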