Type: New Feature
Resolution: Fixed
Priority: Critical
Component/s: workflow-durable-task-step-plugin
Labels:
- issue-exported-to-github

While my pipeline was running, the node that was executing logic terminated. I see this at the bottom of my console output:

Cannot contact ip-172-31-242-8.us-west-2.compute.internal: java.io.IOException: remote file operation failed: /ebs/jenkins/workspace/common-pipelines-nodeploy at hudson.remoting.Channel@48503f20:ip-172-31-242-8.us-west-2.compute.internal: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on ip-172-31-242-8.us-west-2.compute.internal failed. The channel is closing down or has closed down

There's a spinning arrow below it.

I have a cron script that uses the Jenkins master CLI to remove nodes which have stopped responding. When I examine this node's page in the Jenkins web UI, it looks like the node is still running that job, and I see an orange label that says "Feb 22, 2018 5:16:02 PM Node is being removed".

I'm wondering what would be a better way to say "If the channel closes down, retry the work on another node with the same label"?

Things seem stuck. Please advise.
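
For illustration, the behaviour asked for above could be sketched in Scripted Pipeline roughly as follows. This is a sketch under assumptions, not a confirmed fix for this report: it assumes an installation whose `retry` step accepts a `conditions` parameter and provides the `agent()` and `nonresumable()` conditions, which depends on the installed Pipeline plugin versions. The label `linux` and the `make` command are placeholders that do not come from this report.

```groovy
// Sketch only: rerun the whole node block if the agent is lost mid-build
// (for example, the remoting channel closes), instead of waiting forever.
// 'linux' and 'make' are hypothetical placeholders.
retry(count: 2, conditions: [agent(), nonresumable()]) {
    // Each attempt asks for an executor matching the label again,
    // so a second attempt can land on a different node.
    node('linux') {
        sh 'make'
    }
}
```

Because `retry` wraps the `node` block, the body starts over on each attempt rather than resuming, so the steps inside should be safe to repeat.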

Attachments

grub.remoting.logs.zip
2018-07-04 08:19
3 kB
Federico Naum
grubSystemInformation.html
2018-07-04 08:19
67 kB
Federico Naum
image-2018-02-22-17-27-31-541.png
2018-02-23 01:27
56 kB
Jon B
image-2018-02-22-17-28-03-053.png
2018-02-23 01:28
30 kB
Jon B
JavaMelodyGrubHeapDump_4_07_18.pdf
2018-07-04 08:19
220 kB
Federico Naum
JavaMelodyNodeGrubThreads_4_07_18.pdf
2018-07-04 08:19
9 kB
Federico Naum
jenkins_agent_devbuild9_remoting_logs.zip
2018-06-29 01:31
4 kB
Federico Naum
jenkins_Agent_devbuild9_System_Information.html
2018-06-29 01:31
66 kB
Federico Naum
jenkins_agents_Thread_dump.html
2018-06-29 01:31
172 kB
Federico Naum
jenkins_support_2018-06-29_01.14.18.zip
2018-06-29 01:31
1.26 MB
Federico Naum
jenkins.log
2018-07-04 08:19
984 kB
Federico Naum
jobConsoleOutput.txt
2018-07-04 08:19
12 kB
Federico Naum
jobConsoleOutput.txt
2018-07-04 08:18
12 kB
Federico Naum
MonitoringJavaelodyOnNodes.html
2018-07-04 08:19
44 kB
Federico Naum
NetworkAndMachineStats.png
2018-07-04 08:19
224 kB
Federico Naum
slaveLogInMaster.grub.zip
2018-07-04 08:19
8 kB
Federico Naum
support_2018-07-04_07.35.22.zip
2018-07-04 08:19
956 kB
Federico Naum
threadDump.txt
2018-11-17 22:17
98 kB
Amir Barkal
Thread dump [Jenkins].html
2018-07-04 08:19
219 kB
Federico Naum

causes

JENKINS-73618 ws step re-provisions already provisioned workspace if controller restarted in midway during build

Open

JENKINS-69936 PWD returning wrong path

Resolved

JENKINS-70528 node / dir / node on same agent sets PWD to that of dir rather than @2 workspace

Resolved

depends on

JENKINS-30383 SynchronousNonBlockingStepExecution should allow restart of idempotent steps

Resolved

is duplicated by

JENKINS-49241 pipeline hangs if slave node momentarily disconnects

Open

JENKINS-47868 Pipeline durability hang when slave node disconnected

Reopened

JENKINS-43781 Quickly detecting and restarting a job if the job's slave disconnects

Resolved

JENKINS-57675 Pipeline steps running forever when executor fails

Resolved

JENKINS-47561 Pipelines wait indefinitely for kubernetes slaves to come back online

Closed

JENKINS-43607 Jenkins pipeline not aborted when the machine running docker container goes offline

Resolved

JENKINS-56673 Better handling of ChannelClosedException in Declarative pipeline

Resolved

is related to

JENKINS-41854 Contextualize a fresh FilePath after an agent reconnection

Resolved

relates to

JENKINS-36013 Automatically abort ExecutorPickle rehydration from an ephemeral node

Closed

JENKINS-61387 SlaveComputer not cleaned up after the channel is closed

Open

JENKINS-67285 if jenkins-agent pod has removed fail fast jobs that use this jenkins-agent pod

Open

JENKINS-71113 AgentErrorCondition should handle "missing workspace" error

Open

JENKINS-60507 Pipeline stuck when allocating machine | node block appears to be neither running nor scheduled

Reopened

JENKINS-59340 Pipeline hangs when Agent pod is Terminated

Resolved

JENKINS-35246 Kubernetes agents not getting deleted in Jenkins after pods are deleted

Resolved

JENKINS-70333 Default for Declarative agent retries

Open

JENKINS-68963 build logs should contain if a spot agent is terminated

Open

links to

CloudBees-internal issue

jenkins-infra/docker-jenkins-weekly #512

jenkins-infra/pipeline-library #405

kubernetes-plugin #1083

pipeline-model-definition-plugin #533

workflow-api-plugin #217

workflow-basic-steps-plugin #195

workflow-durable-task-step-plugin #180

workflow-durable-task-step-plugin #254

