JENKINS-64109

Disconnected SSH slave agents automatically restart


Details

    Description

      I have a Jenkins master running on Windows 10, and Jenkins slave agents also running on Windows 10.

      They communicate and connect fine. Jobs run. Everything is nice.

      However, when I try to take a slave agent offline, it does not stay offline; it automatically reconnects.

      The steps to reproduce this issue are as follows:

      1. Jenkins -> Manage Jenkins -> Manage Nodes and Clouds
      2. Select a node
      3. Select Disconnect
        The node disconnects.
      4. Wait about 30 seconds
        The node automatically reconnects.


        Activity

          ifernandezcalvo Ivan Fernandez Calvo added a comment -

          >Ultimately I was trying to get the remote agents to completely stop and then restart so they would come up with newly set environment variables. I was trying to disconnect the nodes so the SSH session would completely disconnect, and then I would restart.
          >However, I found that with the "quick restart" the SSH session seems to be maintained.

          I am not sure I follow. Do you mean that after adding new environment variables to the agent configuration and disconnecting the agent, those variables are not there when it connects again? That would be weird: when you disconnect the agent you close the SSH connection, and when you connect you open a new one.
          rocha_stratovan John Rocha added a comment -

          I added environment variables to the slave via the slave's CLI, that is, on the slave machine itself. We have commands that need environment variable settings, and we want commands run natively on the machine to behave the same as when run through Jenkins. Hence many of our variables are deployed on the machine, not via Jenkins configuration.

          When I was using agents as a service I would have to:

          1. log in to the remote machine
          2. set the environment variable
          3. restart the Jenkins agent service

          When the Jenkins agent service restarted it would slurp up the newly set environment variables.

          In this case I tried:

          1. log in to the remote machines
          2. set the environment variables
          3. disable the agent from the master interface

          Since the agents are now started via SSH it's not as simple as just restarting the agent service. I now need to cause the entire SSH session to the Windows machine to terminate and then restart, since environment variables are only visible to newly started processes.
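
          The point that environment variables are only visible to newly started processes can be illustrated with a small Python sketch. It is not Jenkins or OpenSSH code; the variable name DEMO_BUILD_FLAG and the helper read_in_child are hypothetical and exist only for this demonstration. A process started with an older copy of the environment does not see a variable that was set afterwards, while a process started later does.

```python
import os
import subprocess
import sys


def read_in_child(name, env):
    """Start a fresh Python process with `env` and return the value it sees for `name`."""
    code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Make sure the hypothetical variable is not already set.
os.environ.pop("DEMO_BUILD_FLAG", None)

# Snapshot of the environment as a long-running process (agent, SSH server)
# would have captured it at startup.
snapshot = dict(os.environ)

# The variable is set only after the snapshot was taken.
os.environ["DEMO_BUILD_FLAG"] = "1"

before = read_in_child("DEMO_BUILD_FLAG", snapshot)          # launched from the old snapshot
after = read_in_child("DEMO_BUILD_FLAG", dict(os.environ))   # launched from the current environment
print(before, after)  # <unset> 1
```

          The same reasoning would apply one level up: if whatever process spawns the agent still holds an old environment, the agent inherits that old copy no matter how often it is restarted.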

          Hmmm... but now that I browse the connection log more carefully, I agree with you: disconnecting also terminates the SSH connection.

           

          [11/02/20 13:34:45] [SSH] Checking java version of java
          [11/02/20 13:34:46] [SSH] java -version returned 1.8.0_272.
          [11/02/20 13:34:46] [SSH] Starting sftp client.
          [11/02/20 13:34:46] [SSH] Copying latest remoting.jar...
          Source agent hash is D866F0B482DB94F38E49B26B465D5DB5. Installed agent hash is D866F0B482DB94F38E49B26B465D5DB5
          Verified agent jar. No update is necessary.
          Expanded the channel window size to 4MB
          [11/02/20 13:34:48] [SSH] Starting agent process: cd "c:\jenkins" && java  -jar remoting.jar -workDir c:\jenkins -jar-cache c:\jenkins/remoting/jarCache
          Nov 02, 2020 1:34:48 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
          INFO: Using c:\jenkins\remoting as a remoting work directory
          Nov 02, 2020 1:34:48 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
          INFO: Both error and output logs will be printed to c:\jenkins\remoting
          <===[JENKINS REMOTING CAPACITY]===>channel started
          Remoting version: 4.5
          This is a Windows agent
          Agent successfully connected and online
          channel stoppedConnection terminated          <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                                        <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
          Agent JVM has terminated. Exit code=0         <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
          [11/03/20 10:22:01] [SSH] Connection closed.
          SSHLauncher{host='test-64', port=22, credentialsId='74dce9aa-167d-4df1-9269-d65e38820332', jvmOptions='', javaPath='', prefixStartSlaveCmd='', suffixStartSlaveCmd='', launchTimeoutSeconds=60, maxNumRetries=10, retryWaitTime=15, sshHostKeyVerificationStrategy=hudson.plugins.sshslaves.verifiers.KnownHostsFileKeyVerificationStrategy, tcpNoDelay=true, trackCredentials=true}
          [11/03/20 10:22:44] [SSH] Opening SSH connection to test-64:22.
          Searching for test-64 in C:\Users\Build\.ssh\known_hosts
          Searching for test-64:22 in C:\Users\Build\.ssh\known_hosts
          [11/03/20 10:22:45] [SSH] SSH host key matches key in Known Hosts file. Connection will be allowed.
          [11/03/20 10:22:45] [SSH] Authentication successful.
          [11/03/20 10:22:45] [SSH] The remote user's environment is:
          


          Now I'm starting to wonder if the SSH Server service needs to be restarted so that it slurps up the new environment variables, so that new SSH clients going through the Server will get the new vars?


          ifernandezcalvo Ivan Fernandez Calvo added a comment -

          >Now I'm starting to wonder if the SSH Server service needs to be restarted so that it slurps up the new environment variables, so that new SSH clients going through the Server will get the new vars?

          On Unix you can use the `~/.ssh/environment` file to set user variables for SSH sessions. The nice thing about that file is that you can copy it over SSH, so you will not need to log in to those Windows machines.

          ~/.ssh/environment
          This file is read into the environment at login (if it exists). It can only contain empty lines, comment lines (that start with ‘#’), and assignment lines of the form name=value. The file should be writable only by the user; it need not be readable by anyone else. Environment processing is disabled by default and is controlled via the PermitUserEnvironment option.
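
          The file format described above (empty lines, '#' comment lines, and name=value assignments) is simple enough to generate and check with a short script. The following Python sketch is only an illustration: the helper names and the variables BUILD_TOOLS_HOME and LICENSE_SERVER are hypothetical, and the generated file has an effect only if the SSH server has PermitUserEnvironment enabled, as the quoted documentation notes.

```python
def render_environment(variables):
    """Render a dict as text in the name=value format quoted above."""
    lines = ["# one name=value assignment per line"]
    for name, value in variables.items():
        if not name or "=" in name or "\n" in name:
            raise ValueError(f"invalid variable name: {name!r}")
        if "\n" in value:
            raise ValueError(f"value for {name} must fit on one line")
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


def parse_environment(text):
    """Parse the format back into a dict, skipping empty lines and '#' comments."""
    result = {}
    for raw in text.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue
        name, sep, value = raw.partition("=")
        if not sep:
            raise ValueError(f"not an assignment line: {raw!r}")
        result[name] = value
    return result


# Hypothetical variables, used only for this example.
wanted = {"BUILD_TOOLS_HOME": r"c:\tools", "LICENSE_SERVER": "27000@licenses"}
text = render_environment(wanted)
print(text)
```

          The rendered text could then be copied to the agent user's .ssh folder over SSH; parse_environment is there to verify that what was written reads back unchanged.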

          Also, I found this, which probably explains it: https://serverfault.com/questions/100438/windows-user-environment-variables-not-available-in-openssh-session

          rocha_stratovan John Rocha added a comment -

          Thanks for the links and suggestions.

          I was eventually able to get the Windows environment read; it worked after rebooting the machine, which is what makes me think that resetting the OpenSSH service might be enough. These aren't account-specific Windows environment variables; they are machine-level global environment variables.

          I'll definitely look into the ~/.ssh/environment for my Linux deploys, although I haven't had problems with them yet.

          Thanks again!


          ifernandezcalvo Ivan Fernandez Calvo added a comment -

          The `~/.ssh/environment` file probably also works on Windows. Check for the folder c:\Users\YOUR_JENKINS_AGENT_USER\.ssh; if the folder exists, you can use the environment file.

          People

            ifernandezcalvo Ivan Fernandez Calvo
            rocha_stratovan John Rocha