JENKINS-49097

Ssh-agent-plugin doesn't kill ssh-agent in top-level matrix jobs

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component/s: ssh-agent-plugin
    • Labels: None
    • Environment: Jenkins 2.32.3, ssh-agent-plugin 1.15

      ssh-agent-plugin starts an ssh-agent process for top-level matrix jobs but never kills it when the job finishes.

      00:00:00.052 [ssh-agent] Looking for ssh-agent implementation...
      00:00:00.167 [ssh-agent]   Exec ssh-agent (binary ssh-agent on a remote machine)
      00:00:00.189 $ ssh-agent
      00:00:00.278 SSH_AUTH_SOCK=/tmp/ssh-T6i78P9tKd5A/agent.28069
      00:00:00.278 SSH_AGENT_PID=28071
      00:00:00.278 [ssh-agent] Started.
      00:00:00.389 $ ssh-add /home/tcwg-buildslave/workspace/tcwg-upstream-monitoring_tmp/private_key_1495902688254844701.key
      00:00:00.408 Identity added: /home/tcwg-buildslave/workspace/tcwg-upstream-monitoring_tmp/private_key_1495902688254844701.key (/home/tcwg-buildslave/workspace/tcwg-upstream-monitoring_tmp/private_key_1495902688254844701.key)
      00:00:00.520 [ssh-agent] Using credentials tcwg-buildslave (buildslave for TCWG machines)
      00:00:00.542 Set build name.
      00:00:00.543 Triggering TCWG Upstream Monitoring » gcc-master,tcwg-x86_64-build
      00:00:05.545 Configuration TCWG Upstream Monitoring » gcc-master,tcwg-x86_64-build is still in the queue: Waiting for next available executor on tcwg-x86_64-build
      06:43:08.741 TCWG Upstream Monitoring » gcc-master,tcwg-x86_64-build completed with result FAILURE
      06:43:08.902 Set build name.
      06:43:08.905 Unrecognized macro 'branch' in '${branch} #399'
      06:43:08.907 Finished: FAILURE
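
The log above shows an agent being started (SSH_AGENT_PID=28071) with no teardown reported before "Finished: FAILURE". A minimal Python sketch for spotting this pattern in saved console logs is given here. The function name and the stop-marker string are assumptions made for illustration; adjust the marker to whatever the plugin prints on a successful shutdown in your installation.

```python
import re
from typing import List

# Assumption: the plugin prints a line containing this text when it stops the agent.
STOP_MARKER = "[ssh-agent] Stopped."
PID_PATTERN = re.compile(r"SSH_AGENT_PID=(\d+)")


def leaked_agent_pids(console_log: str) -> List[int]:
    """Return PIDs of agents that were started but never reported as stopped."""
    open_pids: List[int] = []
    for line in console_log.splitlines():
        match = PID_PATTERN.search(line)
        if match:
            # A new agent was started; remember its PID.
            open_pids.append(int(match.group(1)))
        elif STOP_MARKER in line and open_pids:
            # Treat a stop message as closing the most recently started agent.
            open_pids.pop()
    return open_pids


sample = """\
00:00:00.189 $ ssh-agent
00:00:00.278 SSH_AGENT_PID=28071
00:00:00.278 [ssh-agent] Started.
06:43:08.907 Finished: FAILURE
"""
print(leaked_agent_pids(sample))  # [28071]
```

Running this over archived logs of top-level matrix builds would give a rough list of PIDs to compare against the processes still alive on the node.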
      

      Since a top-level matrix job only spawns child jobs, it does not need access to ssh-agent keys (note that SCM clones/checkouts use their own interface to ssh-agent-plugin).  Therefore ssh-agent-plugin could either skip starting ssh-agent for top-level matrix jobs altogether, or terminate the agent during cleanup.  It is not clear why the existing cleanup code is not triggered for top-level matrix jobs.

      This issue causes thousands of ssh-agent processes to accumulate on busy systems.  To clean up these processes safely, one has to wait until the system is idle, to avoid killing the few ssh-agent processes that are still in use.  Busy systems, unfortunately, are rarely idle.
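
As a stopgap that does not require an idle system, leaked agents could be told apart from active ones by age. The following Python sketch only selects candidate PIDs and kills nothing. It assumes Linux procps output from `ps -eo pid=,etimes=,comm=` (PID, elapsed seconds, command name); the function name and the 24-hour threshold are hypothetical. The threshold must exceed the longest legitimate build, and the log above already shows one build running for more than 6 hours.

```python
from typing import List


def stale_agent_pids(ps_output: str, min_age_seconds: int) -> List[int]:
    """Pick ssh-agent PIDs whose elapsed time is at least min_age_seconds.

    ps_output is expected to look like the output of
    `ps -eo pid=,etimes=,comm=`: one process per line with PID,
    elapsed seconds, and command name.
    """
    stale: List[int] = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, elapsed, command = fields
        if not (pid.isdigit() and elapsed.isdigit()):
            continue  # skip lines that are not in the expected format
        if command.strip() == "ssh-agent" and int(elapsed) >= min_age_seconds:
            stale.append(int(pid))
    return stale


# Illustrative input only; these are not measurements from the affected system.
sample_ps = """\
28071 90000 ssh-agent
28123    42 ssh-agent
  301 90500 sshd
"""
print(stale_agent_pids(sample_ps, min_age_seconds=24 * 3600))  # [28071]
```

An age cutoff is only a heuristic: it cannot distinguish a leaked agent from one that belongs to an unusually long build, so the selected PIDs are best reviewed before anything is killed.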

            Assignee: Unassigned
            Reporter: Maxim Kuvyrkov (maxim_kuvyrkov)
            Votes: 1
            Watchers: 4
