
ssh-agent-plugin leaking some ssh-agent processes

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component/s: core, ssh-agent-plugin
    • None
    • Environment: Jenkins 2.32.3, 2.190.2
      ssh-agent-plugin 1.15, 1.17
    • Released As: Jenkins 2.257

      When a job with the SSHAgentBuildWrapper enabled fails very early (for instance during SCM checkout), an ssh-agent process is left behind. The issue is that the SSHAgentEnvironment is instantiated very early (from preCheckout), but its tearDown method will only be called if execution reaches BuildExecution.doRun (which comes after the SCM checkout phase in AbstractBuildExecution.run).
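
The ordering problem described above can be illustrated with a small standalone model. This is only a sketch: `LeakModel`, `Env` and `runBuild` are hypothetical names made up for illustration, not the Jenkins classes, and the model only mimics the order of the phases (environment created before checkout, tearDown performed only in the later run phase).

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the build lifecycle described in this issue.
// These are NOT the Jenkins classes; they only reproduce the ordering of phases.
class LeakModel {

    /** Stand-in for an Environment owning an external resource (e.g. an ssh-agent process). */
    static class Env {
        boolean tornDown = false;

        void tearDown() {
            tornDown = true;
        }
    }

    /** Runs one simulated build and returns how many environments were never torn down. */
    static int runBuild(boolean checkoutFails) {
        List<Env> envs = new ArrayList<>();
        try {
            // "preCheckout" phase: the environment is created early.
            envs.add(new Env());

            // "SCM checkout" phase: may fail before the run phase is reached.
            if (checkoutFails) {
                throw new IllegalStateException("SCM checkout failed");
            }

            // "doRun" phase: the only place where tearDown is performed.
            try {
                // build steps would run here
            } finally {
                for (Env e : envs) {
                    e.tearDown();
                }
            }
        } catch (IllegalStateException ex) {
            // The build is marked as failed; nothing tears down the early environments.
        }

        int leaked = 0;
        for (Env e : envs) {
            if (!e.tornDown) {
                leaked++;
            }
        }
        return leaked;
    }

    public static void main(String[] args) {
        System.out.println("leaked after normal build: " + runBuild(false));
        System.out.println("leaked after failed checkout: " + runBuild(true));
    }
}
```

In this model a normal build leaves nothing behind, while a build whose checkout fails leaves one environment that was never torn down, which corresponds to the stray ssh-agent process.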

      Before ssh-agent-plugin 1.14, there was no ssh-agent process, so the fact that some SSHAgentEnvironment instances were not torn down was less visible (though there was probably already a less obvious resource leak, with AgentServer instances not being properly closed).

      This kind of issue, where an Environment is not properly torn down, can happen whenever it is instantiated not from BuildWrapper.setUp but from an earlier phase (like BuildWrapper.preCheckout or RunListener.setUpEnvironment). As such, maybe that's something that should be fixed in core (maybe in AbstractBuildExecution.run) rather than specifically in the ssh-agent-plugin, I don't know...

      I've written and attached a "generic workaround" RunListener, which tries to detect this situation from onCompleted and calls tearDown on every Environment for which it has not been done already. It's not something I propose for inclusion, but rather some code to exhibit the issue. If an ssh-agent-specific fix is desirable, then a similar approach might be an option (but targeting SSHAgentEnvironment only).
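
The idea behind the workaround can be sketched without the Jenkins API. This is not the attached RunListener: `TearDownGuard`, `Env` and the static `onCompleted` hook below are hypothetical stand-ins, and walking the list in reverse creation order is an assumption made for this sketch. The point is only that a completion hook can tear down whatever is still live while skipping environments that were already handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the workaround idea.
// Not the attached RunListener and not the Jenkins API.
class TearDownGuard {

    /** Stand-in for an Environment; counts calls so double tearDown can be detected. */
    static class Env {
        int tearDownCalls = 0;

        boolean isTornDown() {
            return tearDownCalls > 0;
        }

        void tearDown() {
            tearDownCalls++;
        }
    }

    /**
     * Completion hook: tears down every environment that is still live,
     * in reverse creation order (an assumption of this sketch), and
     * returns how many environments it had to clean up.
     */
    static int onCompleted(List<Env> envs) {
        int cleaned = 0;
        for (int i = envs.size() - 1; i >= 0; i--) {
            Env e = envs.get(i);
            if (!e.isTornDown()) {
                e.tearDown();
                cleaned++;
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        List<Env> envs = new ArrayList<>();
        Env early = new Env();      // created before checkout, never torn down
        Env regular = new Env();    // already torn down by the normal run phase
        regular.tearDown();
        envs.add(early);
        envs.add(regular);

        System.out.println("cleaned up by hook: " + onCompleted(envs));
    }
}
```

Because the hook checks each environment first, it is safe to run after both failed and successful builds: environments already torn down by the normal path are not torn down twice.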

          [JENKINS-43889] ssh-agent-plugin leaking some ssh-agent processes

          Thomas de Grenier de Latour created issue -
          Thomas de Grenier de Latour made changes -
          Description: wording change only (the plural forms {{SSHAgentEnvironment}}s, {{AgentServer}}s and {{Environment}}s changed to singular); text otherwise unchanged.
          Thomas de Grenier de Latour made changes -
          Environment
          Original: Jenkins 2.32.3; ssh-agent-plugin 1.15
          New: Jenkins 2.32.3, 2.190.2; ssh-agent-plugin 1.15, 1.17
          Thomas de Grenier de Latour made changes -
          Component/s New: core [ 15593 ]
          Thomas de Grenier de Latour made changes -
          Assignee New: Thomas de Grenier de Latour [ tom_gl ]
          Thomas de Grenier de Latour made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]
          Thomas de Grenier de Latour made changes -
          Description: wording change only ("teared down" corrected to "torn down"); text otherwise unchanged.
          Mark Waite made changes -
          Released As New: Jenkins 2.257
          Resolution New: Fixed [ 1 ]
          Status Original: In Progress [ 3 ] New: Resolved [ 5 ]
          Mark Waite made changes -
          Status Original: Resolved [ 5 ] New: Closed [ 6 ]
          Oleg Nenashev made changes -
          Labels New: lts-candidate

            Assignee: tom_gl Thomas de Grenier de Latour
            Reporter: tom_gl Thomas de Grenier de Latour
            Votes: 4
            Watchers: 5
