-
Bug
-
Resolution: Fixed
-
Critical
-
None
-
Jenkins 1.529
OSX 10.8.4 (running as a VMware guest in VMware Workstation 9.0.2 inside a Windows 7 host)
also Jenkins 1.645, OSX 10.9, 10.10 (not vm)
also observed with Windows and Linux slaves.
-
-
ssh-slaves-1.31.1
I configured an OSX slave to use an SSH connection. I have an identical setup for a Linux slave. The Linux slave never hangs, but the OSX one does randomly every couple of days.
When the slave hangs, I see:
This node is being launched. See log for more details
When I click on more details I see an empty log (literally no characters) with a spinning wheel.
I'd like to disconnect the channel and try again. Unfortunately, there is no "disconnect" button, seemingly because the hang occurs too early in the connection phase.
The only way I found to fix this problem is to restart the Jenkins master. I believe this issue is high priority because:
- This hang occurs at least once a day (for over a week now).
- There is no known workaround.
- There is no way to recover except to restart the master node, which means that all running jobs have to be interrupted.
If you can add extra logging, I can try collecting more information for you. Where do we get started?
- is duplicated by
  JENKINS-47012 SSH Slaves launcher's afterDisconnect() is synchronous, it gets blocked by reconnect operations (Resolved)
- is related to
  JENKINS-48613 SSH Slaves 1.23 can create lots of threads waiting for SSHLauncher lock in tearDownConnection (Resolved)
[JENKINS-19465] Slave hangs while being launched
According to the node's "load statistics" it was running fine until exactly 9am. Then, for an unknown reason, the node got disconnected. When I got around to looking at Jenkins later that day I found the node in the "This node is being launched" state again... hanging forever.
I'd like to avoid having to restart the Jenkins server once a day (or potentially multiple times a day) to fix the OSX slave. Any ideas?
I see an open ssh tunnel from master to the OSX machine but I see no proof that Jenkins is running (according to both "jps" and "ps").
Is there a way for me to find out why the node got disconnected (a log that spans multiple connections/disconnections) and what it's blocked on trying to reconnect?
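One possible way to approach the "what is it blocked on" part is to take a thread dump of the master JVM while the node is stuck (for example with `jstack <pid>`, which ships with the JDK alongside `jps`) and look for threads waiting on a lock that another thread holds. The sketch below is only an illustration of that idea: the sample dump, the thread names, and the `com.example.Launcher` class are all made up, and `find_blocked` is a hypothetical helper, not part of Jenkins or any plugin.

```python
import re
from collections import defaultdict

# Hypothetical sample in the layout that `jstack <pid>` prints.
# Thread names, addresses, and classes are invented for illustration.
SAMPLE_DUMP = '''\
"launcher-1" #41 daemon prio=5 os_prio=0 tid=0x01 nid=0x2a waiting for monitor entry [0x01]
   java.lang.Thread.State: BLOCKED (on object monitor)
\tat com.example.Launcher.tearDown(Launcher.java:10)
\t- waiting to lock <0x00000000c0ffee00> (a com.example.Launcher)

"launcher-2" #42 daemon prio=5 os_prio=0 tid=0x02 nid=0x2b runnable [0x02]
   java.lang.Thread.State: RUNNABLE
\tat com.example.Launcher.launch(Launcher.java:20)
\t- locked <0x00000000c0ffee00> (a com.example.Launcher)
'''

THREAD_RE = re.compile(r'^"(?P<name>[^"]+)"')
WAIT_RE = re.compile(r'- waiting to lock <(?P<lock>0x[0-9a-f]+)>')
HELD_RE = re.compile(r'- locked <(?P<lock>0x[0-9a-f]+)>')


def find_blocked(dump):
    """Map each contended lock id to its owner thread and waiting threads."""
    locks = defaultdict(lambda: {"owner": None, "waiters": []})
    current = None  # name of the thread whose stack is being read
    for line in dump.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current = header.group("name")
            continue
        if current is None:
            continue  # skip any preamble before the first thread
        waiting = WAIT_RE.search(line)
        if waiting:
            locks[waiting.group("lock")]["waiters"].append(current)
            continue
        held = HELD_RE.search(line)
        if held:
            locks[held.group("lock")]["owner"] = current
    # Only locks that at least one thread is waiting for are interesting.
    return {lock: info for lock, info in locks.items() if info["waiters"]}


if __name__ == "__main__":
    for lock, info in find_blocked(SAMPLE_DUMP).items():
        print(lock, "held by", info["owner"], "blocking", info["waiters"])
```

If a dump taken during a hang showed a launcher thread holding a lock while other threads wait on it, attaching that dump to this issue would probably be more useful than the empty node log.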