During the day, I run lots of Jenkins slaves. During the evening, I use AWS to autoscale down the number of slaves I'm using. AWS simply terminates the instances. Jenkins probably would call this a "channel disconnect".
I noticed that any jobs running when the slave is killed off hang for a really long time. For example, the link below shows a job which had a 10-minute timeout set. I kill the job off at the 24-second mark, but the job hangs until the 10-minute mark, where the Jenkins timeout plugin detects a timeout. It then spends the next 7 minutes hanging until Jenkins realizes the channel is disconnected.
https://gist.github.com/blockjon/6358b4124935fa4e72ba8a7d5bd12291
What's a better way to have jobs quickly stopped and/or restarted if the slave they are running on is disconnected?
Desired Behavior:
Jenkins detects the channel is disconnected within 30 seconds. It proceeds to restart the job via another healthy node.
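To make the desired behavior concrete, here is a minimal sketch of the idea in Python. It is not Jenkins code and does not use any Jenkins API: the `Agent` class, the `reschedule_jobs` helper, and the 30-second constant are hypothetical names chosen for illustration. It assumes each agent reports a heartbeat timestamp, treats an agent as disconnected once its last heartbeat is older than the threshold, and moves that agent's jobs to the least-loaded healthy agent.

```python
from dataclasses import dataclass, field

# Hypothetical threshold taken from the "within 30 seconds" wish above.
DISCONNECT_AFTER_SECONDS = 30.0


@dataclass
class Agent:
    """Illustrative stand-in for a build agent; not a Jenkins class."""
    name: str
    last_heartbeat: float  # timestamp (seconds) of the last heartbeat seen
    jobs: list = field(default_factory=list)


def is_disconnected(agent, now, limit=DISCONNECT_AFTER_SECONDS):
    """An agent counts as disconnected once its heartbeat is older than limit."""
    return now - agent.last_heartbeat > limit


def reschedule_jobs(agents, now, limit=DISCONNECT_AFTER_SECONDS):
    """Move jobs from disconnected agents to the least-loaded healthy agent.

    Returns a list of (job, from_agent, to_agent) tuples for the jobs moved.
    If no healthy agent exists, jobs stay where they are.
    """
    healthy = [a for a in agents if not is_disconnected(a, now, limit)]
    moved = []
    for agent in agents:
        if not is_disconnected(agent, now, limit):
            continue
        while agent.jobs and healthy:
            target = min(healthy, key=lambda a: len(a.jobs))
            job = agent.jobs.pop(0)
            target.jobs.append(job)
            moved.append((job, agent.name, target.name))
    return moved
```

Calling `reschedule_jobs` on a fixed schedule (for example every few seconds) would bound the detection delay by the threshold plus the polling interval, rather than by a job-level timeout.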
- duplicates JENKINS-49707 Auto retry for elastic agents after channel closure (Resolved)
[JENKINS-43781] Quickly detecting and restarting a job if the job's slave disconnects
Component/s | New: remoting [ 15489 ] |
Link | New: This issue duplicates |
Resolution | New: Duplicate [ 3 ] | |
Status | Original: Open [ 1 ] | New: Resolved [ 5 ] |