  2. JENKINS-60292

EC2 Plugin Thread Leak

Details

    • 1.66

    Description

      We run a Jenkins system that launches up to 120 Ubuntu 18.04 EC2 nodes and terminates each node after an idle timeout of 5 minutes. After several days of launching and terminating nodes, the thread count has grown until the system runs out of memory and eventually crashes with Out of Memory errors. The following thread appears to be created for each EC2 node but never destroyed:

      "Thread-10000" daemon prio=5 RUNNABLE
        java.net.SocketInputStream.socketRead0(Native Method)
        java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        java.net.SocketInputStream.read(SocketInputStream.java:171)
        java.net.SocketInputStream.read(SocketInputStream.java:141)
        com.trilead.ssh2.crypto.cipher.CipherInputStream.fill_buffer(CipherInputStream.java:41)
        com.trilead.ssh2.crypto.cipher.CipherInputStream.internal_read(CipherInputStream.java:52)
        com.trilead.ssh2.crypto.cipher.CipherInputStream.getBlock(CipherInputStream.java:79)
        com.trilead.ssh2.crypto.cipher.CipherInputStream.read(CipherInputStream.java:108)
        com.trilead.ssh2.transport.TransportConnection.receiveMessage(TransportConnection.java:232)
        com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:706)
        com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:502)
        java.lang.Thread.run(Thread.java:748)
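
One way to confirm that threads like the one above are accumulating is to count the live threads whose stack contains the `TransportManager.receiveLoop` frame. The following is a minimal sketch, not part of the plugin: the class `SshReceiverThreadCount` and its methods are hypothetical names, and only the frame name is taken from the trace above. It inspects only the JVM it runs in, so it would have to be executed inside the Jenkins controller process to be meaningful.

```java
import java.util.Map;

// Hypothetical helper, not part of the EC2 plugin or Trilead.
class SshReceiverThreadCount {
    // Frame taken from the stack trace in this report.
    public static final String CLASS_NAME = "com.trilead.ssh2.transport.TransportManager";
    public static final String METHOD_NAME = "receiveLoop";

    /** Returns true if any frame of the given stack matches the class and method. */
    public static boolean hasFrame(StackTraceElement[] stack, String className, String methodName) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().equals(className)
                    && frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    /** Counts live threads in the current JVM whose stack contains the given frame. */
    public static long countThreadsWithFrame(String className, String methodName) {
        long count = 0;
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            if (hasFrame(entry.getValue(), className, methodName)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Trilead receive-loop threads: "
                + countThreadsWithFrame(CLASS_NAME, METHOD_NAME));
    }
}
```

Sampling this count periodically and comparing it with the number of connected EC2 nodes would show whether it keeps growing after nodes are terminated.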

       

      Below is a snippet of our cloud config (scrubbed of all user-specific data):

       

      <clouds>
      <hudson.plugins.ec2.EC2Cloud plugin="ec2@1.45">
      <name>cloud-name</name>
      <useInstanceProfileForCredentials>false</useInstanceProfileForCredentials>
      <roleArn></roleArn>
      <roleSessionName></roleSessionName>
      <credentialsId>cred</credentialsId>
      <privateKey>
      <privateKey>privatekey</privateKey>
      </privateKey>
      <instanceCap>140</instanceCap>
      <templates>
      <hudson.plugins.ec2.SlaveTemplate>
      <ami>ami-id</ami>
      <description>description</description>
      <zone></zone>
      <securityGroups>security-groups</securityGroups>
      <remoteFS>/home/user</remoteFS>
      <type>size</type>
      <ebsOptimized>true</ebsOptimized>
      <monitoring>true</monitoring>
      <t2Unlimited>false</t2Unlimited>
      <labels>custom-label</labels>
      <mode>NORMAL</mode>
      <initScript>#!/bin/bash -xe

      echo "Hello world"

      </initScript>
      <tmpDir></tmpDir>
      <userData></userData>
      <numExecutors>1</numExecutors>
      <remoteAdmin>user</remoteAdmin>
      <jvmopts></jvmopts>
      <subnetId>subnet-ids</subnetId>
      <idleTerminationMinutes>5</idleTerminationMinutes>
      <iamInstanceProfile>iam-profile</iamInstanceProfile>
      <deleteRootOnTermination>true</deleteRootOnTermination>
      <useEphemeralDevices>false</useEphemeralDevices>
      <customDeviceMapping>/dev/sda1=:100:true:gp2</customDeviceMapping>
      <instanceCap>120</instanceCap>
      <stopOnTerminate>false</stopOnTerminate>
      <tags>
      <hudson.plugins.ec2.EC2Tag>
      <name>Name</name>
      <value>ec2name</value>
      </hudson.plugins.ec2.EC2Tag>
      <hudson.plugins.ec2.EC2Tag>
      <name>tag</name>
      <value>ec2tag</value>
      </hudson.plugins.ec2.EC2Tag>
      </tags>
      <connectionStrategy>PRIVATE_IP</connectionStrategy>
      <associatePublicIp>false</associatePublicIp>
      <useDedicatedTenancy>false</useDedicatedTenancy>
      <amiType class="hudson.plugins.ec2.UnixData"/>
      <launchTimeout>2147483647</launchTimeout>
      <connectBySSHProcess>true</connectBySSHProcess>
      <maxTotalUses>-1</maxTotalUses>
      <nextSubnet>0</nextSubnet>
      </hudson.plugins.ec2.SlaveTemplate>
      </templates>

      <region>region</region>
      <noDelayProvisioning>false</noDelayProvisioning>
      </hudson.plugins.ec2.EC2Cloud>
      </clouds>

       

      The leak appears to be related to the connectBySSHProcess option: when we set this value to false, we no longer see the thread leak.
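
To find which templates still have the option enabled, the cloud config can be scanned for `connectBySSHProcess` elements set to `true`. This is a rough sketch under the assumption that the config has the shape of the snippet above; `ConnectBySshProcessCheck` and `countEnabled` are hypothetical names, and the XML in `main` is a trimmed stand-in rather than a real configuration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Jenkins or the EC2 plugin.
class ConnectBySshProcessCheck {
    /** Counts connectBySSHProcess elements whose text is "true" in the given XML. */
    public static int countEnabled(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("connectBySSHProcess");
            int enabled = 0;
            for (int i = 0; i < nodes.getLength(); i++) {
                if ("true".equals(nodes.item(i).getTextContent().trim())) {
                    enabled++;
                }
            }
            return enabled;
        } catch (Exception e) {
            // Malformed XML or parser setup failure.
            throw new IllegalArgumentException("Could not parse config XML", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed stand-in for the cloud config shown in this report.
        String xml = "<clouds><hudson.plugins.ec2.EC2Cloud><templates>"
                + "<hudson.plugins.ec2.SlaveTemplate>"
                + "<connectBySSHProcess>true</connectBySSHProcess>"
                + "</hudson.plugins.ec2.SlaveTemplate>"
                + "</templates></hudson.plugins.ec2.EC2Cloud></clouds>";
        System.out.println("Templates with connectBySSHProcess enabled: " + countEnabled(xml));
    }
}
```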

          Activity

            jglick Jesse Glick added a comment -

            I wonder if https://github.com/jenkinsci/ec2-plugin/pull/222 would address this.


            People

              jglick Jesse Glick
              jakemroz Jake Mroz
