• Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • None
    • Centos 6.6
      Jenkins 1.621

      See attachment system_info.txt for full dump.

      This issue was originally opened against the stash-build-plugin, but after further investigation we found that the cron system of Jenkins core stops working as expected.

      Since the introduction of more cron-based plugins such as stash-pullrequest-builder, the cron system now stops working approximately weekly. In the jenkins_debug_10072015.tar attachment, most jobs show a last polling time of around 9:05-9:10am. After that, none of the cron-type polling jobs responded.

      Very close to JENKINS-25704; almost a duplicate.

        1. triggers_fine_log_10_8_2015.txt
          47 kB
        2. system_info.txt
          10 kB
        3. support_2015-09-20_20.18.47.zip
          2.00 MB
        4. screenshot-2.png
          16 kB
        5. screenshot-1.png
          3 kB
        6. jenkins_threaddump_2.txt
          67 kB
        7. jenkins_threaddump_1.txt
          66 kB
        8. jenkins_logs.txt
          75 kB
        9. jenkins_cron_problem-10-7-2015.zip
          3.18 MB
        10. debug_sockets.txt
          14 kB

          [JENKINS-30558] Cron based jobs are no longer triggered

          Jonathan Strickland added a comment -

          Moving over to core. We have seen this issue twice since then and found that other areas of the system are impacted. The number of jobs, plus the introduction of cron polling against Stash, is putting a heavier load on the system, so we now see the issue almost weekly.

          Jonathan Strickland added a comment -

          Adding jenkins_cron_problem-10-7-2015 with the support bundle, lsof dump, sysconfig, etc.; as much as I could initially gather.

          Jonathan Strickland added a comment - edited

          Hit the issue on 10/8/2015 too. The last entry written by the logger was at 3:39:00 PM, which correlates with all the polling jobs no longer being scheduled.

          The only interesting item I found in the Jenkins log was an exception at 3:37:13 PM:

          Oct 08, 2015 3:37:13 PM org.eclipse.jetty.util.log.JavaUtilLog warn
          WARNING:
          java.nio.channels.ClosedChannelException
          at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
          at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:479)
          at org.eclipse.jetty.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:293)
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:402)
          at org.eclipse.jetty.io.nio.SslConnection.process(SslConnection.java:337)
          at org.eclipse.jetty.io.nio.SslConnection.access$900(SslConnection.java:48)
          at org.eclipse.jetty.io.nio.SslConnection$SslEndPoint.flush(SslConnection.java:738)
          at org.eclipse.jetty.io.nio.SslConnection$SslEndPoint.shutdownOutput(SslConnection.java:641)
          at org.eclipse.jetty.io.nio.SslConnection.onIdleExpired(SslConnection.java:260)
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint.onIdleExpired(SelectChannelEndPoint.java:349)
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:326)
          at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:745)

          Oct 08, 2015 3:38:00 PM stashpullrequestbuilder.stashpullrequestbuilder.StashPullRequestsBuilder run


          Jonathan Strickland added a comment -

          After looking at the jstack trace on the master, the issue may be a result of the stash builder. I attached 2 jstack traces taken 5 minutes apart; see thread 23311. It is sitting at java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int). I'm not very familiar with the Timer code, but I was wondering whether this single job's socket read not completing inside the timer task could impact the whole system.

          Thread 23311: (state = IN_NATIVE)

          • java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
          • java.net.SocketInputStream.read(byte[], int, int, int) @bci=87, line=152 (Compiled frame)
          • java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=122 (Compiled frame)
          • sun.security.ssl.InputRecord.readFully(java.io.InputStream, byte[], int, int) @bci=21, line=442 (Compiled frame)
          • sun.security.ssl.InputRecord.read(java.io.InputStream, java.io.OutputStream) @bci=32, line=480 (Compiled frame)
          • sun.security.ssl.SSLSocketImpl.readRecord(sun.security.ssl.InputRecord, boolean) @bci=44, line=934 (Compiled frame)
          • sun.security.ssl.SSLSocketImpl.readDataRecord(sun.security.ssl.InputRecord) @bci=15, line=891 (Compiled frame)
          • sun.security.ssl.AppInputStream.read(byte[], int, int) @bci=72, line=102 (Compiled frame)
          • java.io.BufferedInputStream.fill() @bci=175, line=235 (Interpreted frame)
          • java.io.BufferedInputStream.read() @bci=12, line=254 (Compiled frame)
          • org.apache.commons.httpclient.HttpParser.readRawLine(java.io.InputStream) @bci=19, line=78 (Compiled frame)
          • org.apache.commons.httpclient.HttpParser.readLine(java.io.InputStream, java.lang.String) @bci=11, line=106 (Interpreted frame)
          • org.apache.commons.httpclient.HttpConnection.readLine(java.lang.String) @bci=19, line=1116 (Interpreted frame)
          • org.apache.commons.httpclient.HttpMethodBase.readStatusLine(org.apache.commons.httpclient.HttpState, org.apache.commons.httpclient.HttpConnection) @bci=36, line=1973 (Interpreted frame)
          • org.apache.commons.httpclient.HttpMethodBase.readResponse(org.apache.commons.httpclient.HttpState, org.apache.commons.httpclient.HttpConnection) @bci=21, line=1735 (Compiled frame)
          • org.apache.commons.httpclient.HttpMethodBase.execute(org.apache.commons.httpclient.HttpState, org.apache.commons.httpclient.HttpConnection) @bci=68, line=1098 (Interpreted frame)
          • org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(org.apache.commons.httpclient.HttpMethod) @bci=135, line=398 (Interpreted frame)
          • org.apache.commons.httpclient.HttpMethodDirector.executeMethod(org.apache.commons.httpclient.HttpMethod) @bci=288, line=171 (Interpreted frame)
          • org.apache.commons.httpclient.HttpClient.executeMethod(org.apache.commons.httpclient.HostConfiguration, org.apache.commons.httpclient.HttpMethod, org.apache.commons.httpclient.HttpState) @bci=114, line=397 (Interpreted frame)
          • org.apache.commons.httpclient.HttpClient.executeMethod(org.apache.commons.httpclient.HttpMethod) @bci=14, line=323 (Interpreted frame)
          • stashpullrequestbuilder.stashpullrequestbuilder.stash.StashApiClient.getRequest(java.lang.String) @bci=69, line=140 (Interpreted frame)
          • stashpullrequestbuilder.stashpullrequestbuilder.stash.StashApiClient.getPullRequests() @bci=5, line=44 (Interpreted frame)
          • stashpullrequestbuilder.stashpullrequestbuilder.StashRepository.getTargetPullRequests() @bci=12, line=57 (Interpreted frame)
          • stashpullrequestbuilder.stashpullrequestbuilder.StashPullRequestsBuilder.run() @bci=19, line=30 (Interpreted frame)
          • stashpullrequestbuilder.stashpullrequestbuilder.StashBuildTrigger.run() @bci=28, line=168 (Interpreted frame)
          • hudson.triggers.Trigger.checkTriggers(java.util.Calendar) @bci=253, line=278 (Compiled frame)
          • hudson.triggers.Trigger$Cron.doRun() @bci=43, line=217 (Interpreted frame)
          • hudson.triggers.SafeTimerTask.run() @bci=8, line=51 (Interpreted frame)
          • java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=471 (Compiled frame)
          • java.util.concurrent.FutureTask.runAndReset() @bci=47, line=304 (Interpreted frame)
          • java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) @bci=1, line=178 (Interpreted frame)
          • java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() @bci=37, line=293 (Interpreted frame)
          • java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Compiled frame)
          • java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
          • java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
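The bottom of this trace shows the trigger running inside a periodic task (FutureTask.runAndReset under ScheduledThreadPoolExecutor). A periodic task in java.util.concurrent is only rescheduled after its current run returns, so one run that never returns means no later runs. The following is a minimal sketch of that behaviour using only the JDK; the class name BlockedCronDemo, the method countTicks, and the 50 ms period are hypothetical and are not taken from Jenkins. A latch stands in for the socket read that never completes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockedCronDemo {

    /**
     * Schedules a periodic "cron tick" every 50 ms, observes it for observeMillis,
     * and returns how many ticks were started. When blockFirstTick is true, the
     * first tick blocks forever (a stand-in for a socket read with no timeout).
     */
    public static int countTicks(boolean blockFirstTick, long observeMillis)
            throws InterruptedException {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger ticks = new AtomicInteger();
        CountDownLatch neverReleased = new CountDownLatch(1);
        try {
            timer.scheduleAtFixedRate(() -> {
                int n = ticks.incrementAndGet();
                if (blockFirstTick && n == 1) {
                    try {
                        neverReleased.await(); // hangs like a read on a silent socket
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }, 0, 50, TimeUnit.MILLISECONDS);
            Thread.sleep(observeMillis);
            return ticks.get();
        } finally {
            timer.shutdownNow(); // interrupts the blocked tick so the JVM can exit
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("healthy timer ticks: " + countTicks(false, 500));
        System.out.println("blocked timer ticks: " + countTicks(true, 500));
    }
}
```

With the first tick blocked, the count stays at 1 for as long as the timer is observed. Adding threads to the pool would not change this, because the executor never runs the same periodic task concurrently with itself.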


          Jonathan Strickland added a comment -

          I still haven't gotten into the heart of the Jenkins core code, but apparently one bad plugin can have a detrimental impact on the system if it hangs within its own timer task code, as the debug output below makes fairly clear. Using jstack on the Jenkins process, I found that the hang happens deep in native socket code, causing the Stash trigger to never complete. The thread dump looks like this:

          Thread 23311: (state = IN_NATIVE)

          • java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
          • java.net.SocketInputStream.read(byte[], int, int, int) @bci=87, line=152 (Compiled frame)
          • java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=122 (Compiled frame)
          • sun.security.ssl.InputRecord.readFully(java.io.InputStream, byte[], int, int) @bci=21, line=442 (Compiled frame)
          • sun.security.ssl.InputRecord.read(java.io.InputStream, java.io.OutputStream) @bci=32, line=480 (Compiled frame)
          • sun.security.ssl.SSLSocketImpl.readRecord(sun.security.ssl.InputRecord, boolean) @bci=44, line=934 (Compiled frame)
          • sun.security.ssl.SSLSocketImpl.readDataRecord(sun.security.ssl.InputRecord) @bci=15, line=891 (Compiled frame)
          • sun.security.ssl.AppInputStream.read(byte[], int, int) @bci=72, line=102 (Compiled frame)
          • java.io.BufferedInputStream.fill() @bci=175, line=235 (Interpreted frame)
          • java.io.BufferedInputStream.read() @bci=12, line=254 (Compiled frame)
          • org.apache.commons.httpclient.HttpParser.readRawLine(java.io.InputStream) @bci=19, line=78 (Compiled frame)
          • org.apache.commons.httpclient.HttpParser.readLine(java.io.InputStream, java.lang.String) @bci=11, line=106 (Interpreted frame)
          • org.apache.commons.httpclient.HttpConnection.readLine(java.lang.String) @bci=19, line=1116 (Interpreted frame)
          • org.apache.commons.httpclient.HttpMethodBase.readStatusLine(org.apache.commons.httpclient.HttpState, org.apache.commons.httpclient.HttpConnection) @bci=36, line=1973 (Interpreted frame)
          • org.apache.commons.httpclient.HttpMethodBase.readResponse(org.apache.commons.httpclient.HttpState, org.apache.commons.httpclient.HttpConnection) @bci=21, line=1735 (Compiled frame)
          • org.apache.commons.httpclient.HttpMethodBase.execute(org.apache.commons.httpclient.HttpState, org.apache.commons.httpclient.HttpConnection) @bci=68, line=1098 (Interpreted frame)
          • org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(org.apache.commons.httpclient.HttpMethod) @bci=135, line=398 (Interpreted frame)
          • org.apache.commons.httpclient.HttpMethodDirector.executeMethod(org.apache.commons.httpclient.HttpMethod) @bci=288, line=171 (Interpreted frame)
          • org.apache.commons.httpclient.HttpClient.executeMethod(org.apache.commons.httpclient.HostConfiguration, org.apache.commons.httpclient.HttpMethod, org.apache.commons.httpclient.HttpState) @bci=114, line=397 (Interpreted frame)
          • org.apache.commons.httpclient.HttpClient.executeMethod(org.apache.commons.httpclient.HttpMethod) @bci=14, line=323 (Interpreted frame)
          • stashpullrequestbuilder.stashpullrequestbuilder.stash.StashApiClient.getRequest(java.lang.String) @bci=69, line=140 (Interpreted frame)

          On the last hang-up I was able to kill the open socket and boom.. all cron jobs were back in business:

          [root@triad-jenkins ~]# lsof -p 23220 | grep TCP | grep stash
          java 23220 jenkins 894u IPv6 40628689 0t0 TCP triad-jenkins.cisco.com:60443->rtp-apl-stash1.cisco.com:https (ESTABLISHED)
          echo -e "call close(894)\nquit" > gdb_commands
          gdb -p 23220 --batch -x gdb_commands

          After analyzing the code for the stash pull request builder I found some areas where some defensive code could be added to prevent this (https://github.com/jenkinsci/stash-pullrequest-builder-plugin/blob/master/src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java):

          1. For the HttpClient set two parameters:
          a. ConnectionTimeout: the maximum time to wait for the connection to be established, i.e. for the server to respond to the connection request.
          b. SoTimeout: the maximum period of inactivity between two consecutive data packets arriving at the client after the connection is established.
          2. Related to item 1: apparently these two settings may not always work properly under some conditions, according to this bug filed against the JDK, https://bugs.openjdk.java.net/browse/JDK-8049846. As a further defense, I wrapped each API call in a FutureTask and start the task in a new thread with a time limit of 30 seconds. After 30 seconds, a java.util.concurrent.TimeoutException is thrown, allowing the parent thread to abort the task and the HTTP request.
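To illustrate what the two timeouts in item 1 do, here is a minimal sketch using plain java.net.Socket rather than the Commons HttpClient the plugin uses, so that it runs without extra libraries. The class name SocketTimeoutDemo, the method readTimesOut, and the local "silent server" are hypothetical and exist only for this illustration. The server accepts the TCP connection but never sends a byte, which is roughly the situation in the thread dump above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SocketTimeoutDemo {

    /**
     * Connects to a local server that never replies and tries to read one byte.
     * Returns true if the read is abandoned with a SocketTimeoutException.
     */
    public static boolean readTimesOut(int connectTimeoutMs, int soTimeoutMs)
            throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The listening socket completes the TCP handshake but nothing is ever written.
        try (ServerSocket silentServer = new ServerSocket(0, 50, loopback);
             Socket client = new Socket()) {
            // Connection timeout: upper bound on establishing the connection.
            client.connect(
                new InetSocketAddress(loopback, silentServer.getLocalPort()),
                connectTimeoutMs);
            // SO_TIMEOUT: upper bound on how long a single read may block.
            client.setSoTimeout(soTimeoutMs);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // data or end-of-stream arrived, no timeout
            } catch (SocketTimeoutException expected) {
                return true;  // the read gave up instead of hanging forever
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(1000, 200));
    }
}
```

Without the setSoTimeout call, the read in this sketch would block indefinitely, which is the failure mode described in this issue.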

          This modification has been running on a private Jenkins server instance in our organization with positive results:

          1. The sockets are closed after each request from the client, so fewer resources are held on both client and server.
          2. If a socket does hang, we see the concurrent timeout exception and the cron job is not blocked.
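The FutureTask guard described in this comment can be sketched roughly as follows. This is not the code from the plugin or the pull request; the class name TimedCall, the method callWithLimit, and the thread name are hypothetical, and the comment's 30-second limit is replaced by a short limit in main so the example finishes quickly.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedCall {

    /**
     * Runs the call on a helper thread and waits at most the given limit.
     * On timeout the task is cancelled and TimeoutException reaches the caller,
     * so the calling (timer) thread is never blocked longer than the limit.
     */
    public static <T> T callWithLimit(Callable<T> call, long limit, TimeUnit unit)
            throws TimeoutException, ExecutionException, InterruptedException {
        FutureTask<T> task = new FutureTask<>(call);
        Thread worker = new Thread(task, "api-call-demo");
        worker.setDaemon(true); // a stuck worker must not keep the JVM alive
        worker.start();
        try {
            return task.get(limit, unit);
        } catch (TimeoutException e) {
            // Interrupts the worker. A thread stuck in a blocking socket read is
            // generally not woken by an interrupt, so real code would also abort
            // the HTTP request or close the connection here.
            task.cancel(true);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(callWithLimit(() -> "ok", 1, TimeUnit.SECONDS));
        try {
            callWithLimit(() -> { Thread.sleep(60_000); return "late"; },
                          200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("timed out; the caller is free to continue");
        }
    }
}
```

The design trade-off is one extra thread per call in exchange for a hard upper bound on how long the trigger can hold the timer thread.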

          I'll be opening a pull request within the next 1-2 days once the changes are properly documented inline in the code.


          james norman added a comment -

          jwstric2 I believe I'm experiencing the same issue. Can you send me the changes you made to the code to resolve this?

          Thanks -james


          Jonathan Strickland added a comment -

          jcnorman48, sorry for the late reply. See https://github.com/jenkinsci/stash-pullrequest-builder-plugin/pull/9. I have fixed merge conflicts 3 times because other commits kept landing in the meantime, but the administrator of the plugin has yet to accept the fix.

          james norman added a comment -

          jwstric2 thanks so much for the PR. I may just build off this and upload the plugin. This issue has killed cron entirely a few times.

          Jonathan Strickland added a comment -

          jcnorman48 The Stash Pull Request Builder plugin was a great initial choice for us to get started. Check out https://marketplace.atlassian.com/plugins/se.bjurr.prnfs.pull-request-notifier-for-stash/server/overview for the push model; this has been serving us well for over a month now (less load on Jenkins and the Stash server too). https://christiangalsterer.wordpress.com/2015/04/23/continuous-integration-for-pull-requests-with-jenkins-and-stash/ gives a good walk-through.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jonathan Strickland
          Path:
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java
          http://jenkins-ci.org/commit/stash-pullrequest-builder-plugin/80431f422dce9ea763e59c28e35fef75113203b1
          Log:
          JENKINS-30558 - Defensive code to prevent native deadlock and cleanup socket code

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jonathan Strickland
          Path:
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java
          http://jenkins-ci.org/commit/stash-pullrequest-builder-plugin/ed565dc0ea5cf42443c7ed00d802ec81ca2b6add
          Log:
          JENKINS-30558 - Add some inline comments to code changes

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jonathan Strickland
          Path:
          README.md
          pom.xml
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashBuildTrigger.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashBuilds.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashCause.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashRepository.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashPullRequestResponse.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashPullRequestResponseValueRepository.java
          src/main/resources/stashpullrequestbuilder/stashpullrequestbuilder/StashBuildTrigger/config.jelly
          src/test/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/AdditionalParameterRegExTest.java
          http://jenkins-ci.org/commit/stash-pullrequest-builder-plugin/0a35b20e635a5a445ea4c6b0263216625dc4c719
          Log:
          Merge remote-tracking branch 'origin/master' into JENKINS-30558

          Conflicts:
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java

          Code changed in jenkins
          User: Jonathan Strickland
          Path:
          README.md
          pom.xml
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashRepository.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java
          src/main/resources/stashpullrequestbuilder/stashpullrequestbuilder/StashBuildTrigger/config.jelly
          src/test/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/AdditionalParameterRegExTest.java
          http://jenkins-ci.org/commit/stash-pullrequest-builder-plugin/b71dd2cfd9366009a39a28166e0b123e8dc6132f
          Log:
          Merge remote-tracking branch 'origin/master' into JENKINS-30558

          Conflicts:
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java


          Code changed in jenkins
          User: Jonathan Strickland
          Path:
          README.md
          pom.xml
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashBuildTrigger.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashBuilds.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashCause.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/StashRepository.java
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java
          src/main/resources/stashpullrequestbuilder/stashpullrequestbuilder/StashBuildTrigger/config.jelly
          src/test/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashPullRequestResponseValueRepositoryTest.java
          http://jenkins-ci.org/commit/stash-pullrequest-builder-plugin/9db4cbb1e24eab66090a80b7e1488f2e2a6453b6
          Log:
          Merge remote-tracking branch 'origin/master' into JENKINS-30558

          Conflicts:
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java


          Code changed in jenkins
          User: Nathan
          Path:
          src/main/java/stashpullrequestbuilder/stashpullrequestbuilder/stash/StashApiClient.java
          http://jenkins-ci.org/commit/stash-pullrequest-builder-plugin/02ddd009ec6a34ac4ca322e7204aef3babdf89ca
          Log:
          Merge pull request #9 from jwstric2/JENKINS-30558

          Jenkins 30558

          Compare: https://github.com/jenkinsci/stash-pullrequest-builder-plugin/compare/6424c319aec0...02ddd009ec6a


          Jonathan Strickland added a comment -

          Resolved with the merge of pull request #9 from jwstric2/JENKINS-30558.

          kang hao added a comment - - edited

          Hi, I have updated our plugin to the latest version, and cron-based jobs are triggered again, but not as I expected: they now run at irregular times. Can anyone help me?


          Jonathan Strickland added a comment -

          kanghao Kang,

          I would recommend opening a new issue for your problem. If the upgrade of the Stash Trigger Plugin is causing it, please note the versions you upgraded from and to in the new issue.

          Regards,
          Jonathan

          Vivian Zhang added a comment -

          kanghao I have exactly the same issue as you: the stash pull request plugin causes cron-based jobs to be triggered at random times. In addition, these jobs are also triggered when Stash becomes unavailable, for example during a backup. As a result they are triggered every day at the start of the Stash backup, and all of them fail, of course. This must have been caused by the fix in the new release. The fix should be reworked so that it does not introduce a new bug, so it would be better to keep using this issue ID until the resolution is complete. Did you open a new issue for it? Can I have the new issue ID?


          Jagadish added a comment -

          The issue still persists. The entire Jenkins core trigger system is affected by the stash pull request builder plugin :|


          Jakub Bochenski added a comment -

          This issue has been automatically closed because of inactivity. Please reopen it if you think it's still valid.

          Pavel Roskin added a comment -

          I see that the socket timeout and the request timeout were introduced in the same PR. Does anybody know why request timeouts are needed when socket timeouts are already in place? Running HTTP requests in separate threads adds a lot of complexity, and other plugins don't do it. I assume every request uses a limited number of packets sent over the socket, so the socket timeout should put a limit on the request duration.

          I'm going to move requests back to the main thread unless I hear any objections.
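
          For context on the question above, here is a minimal sketch of the general difference between the two timeouts. It is not the plugin's actual code. In Java, a socket read timeout (`Socket.setSoTimeout`) bounds each blocking read, not the whole exchange, so a response that trickles in with short gaps may never trip it. An overall request deadline is commonly enforced by running the call on a worker thread and waiting with `Future.get(timeout)`. The class name `RequestDeadlineSketch`, the helper names, and the simulated response are all hypothetical; the "response" is simulated with `Thread.sleep` rather than real network I/O.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RequestDeadlineSketch {

    /** Runs a blocking request on the pool and gives up once the overall deadline passes. */
    static <T> T callWithDeadline(ExecutorService pool, Callable<T> request, long deadlineMillis)
            throws Exception {
        Future<T> future = pool.submit(request);
        try {
            return future.get(deadlineMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the worker so an abandoned request does not linger.
            future.cancel(true);
            throw e;
        }
    }

    /**
     * Hypothetical stand-in for a slow server: each chunk arrives after gapMillis.
     * A per-read timeout larger than gapMillis would never fire, however many chunks follow.
     */
    static Callable<String> tricklingResponse(int chunks, long gapMillis) {
        return () -> {
            StringBuilder body = new StringBuilder();
            for (int i = 0; i < chunks; i++) {
                Thread.sleep(gapMillis); // simulated wait for the next chunk
                body.append('x');
            }
            return body.toString();
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Three quick chunks finish well inside the 2 s deadline.
            String ok = callWithDeadline(pool, tricklingResponse(3, 10), 2000);
            System.out.println("fast response length: " + ok.length());

            // Fifty chunks, 100 ms apart: every gap is short, but the total is about 5 s.
            try {
                callWithDeadline(pool, tricklingResponse(50, 100), 300);
                System.out.println("slow response completed");
            } catch (TimeoutException e) {
                System.out.println("slow response abandoned at the deadline");
            }
        } finally {
            pool.shutdownNow();
        }
    }
}
```

          The trade-off is the one raised in the comment: the deadline costs an extra thread per request plus cancellation handling. Whether it is worth keeping depends on whether the server can actually produce slow, trickling responses.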


            jwstric2 Jonathan Strickland