JENKINS-57086

Stuck, hanging, unkillable jobs in Jenkins


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core, slack-plugin
    • Labels: None

    Description

      I have had a job stuck in the queue for 21 days now. According to the log, the build has failed:

      Mar 26, 2019 7:38:04 PM hudson.model.Run execute
      INFO: 0 Update R package docs #11580 main build action completed: FAILURE
      Mar 26, 2019 7:38:04 PM jenkins.plugins.slack.SlackNotifier perform
      INFO: Performing complete notifications
      

      But the job is still shown in the queue as running.

      I've tried everything I could read on https://stackoverflow.com/questions/14456592/how-to-stop-an-unstoppable-zombie-job-on-jenkins-without-restarting-the-server, but it's still there.

      I will restart the server next Monday, but I am opening this issue in the hope that something can be done.

      I couldn't find any other lines related to this job in the logs. The script itself terminated with a non-zero exit code (so it exited and was not left hanging).

      I've attached a gist with the /threadDump output. Searching for the job's name gives:

      Executor #10 for master : executing 0 Update R package docs #11580
              "Executor #10 for master : executing 0 Update R package docs #11580" Id=1961561 Group=main RUNNABLE (in native)
              	at java.net.SocketInputStream.socketRead0(Native Method)
              	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
              	at java.net.SocketInputStream.read(SocketInputStream.java:171)
              	at java.net.SocketInputStream.read(SocketInputStream.java:141)
              	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
              	at sun.security.ssl.InputRecord.read(InputRecord.java:503)
              	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
              	-  locked java.lang.Object@2644728a
              	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
              	-  locked java.lang.Object@18cea778
              	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
              	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
              	at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:275)
              	at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:254)
              	at org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:118)
              	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)
              	at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:363)
              	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:219)
              	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
              	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)
              	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
              	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
              	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
              	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
              	at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:163)
              	at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:104)
              	at jenkins.plugins.slack.ActiveNotifier.completed(ActiveNotifier.java:150)
              	at jenkins.plugins.slack.SlackNotifier.perform(SlackNotifier.java:444)
              	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
              	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
              	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
              	at hudson.model.Build$BuildExecution.cleanUp(Build.java:196)
              	at hudson.model.Run.execute(Run.java:1863)
              	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
              	at hudson.model.ResourceController.execute(ResourceController.java:97)
              	at hudson.model.Executor.run(Executor.java:429)

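      Judging by this stack trace, the Slack notification plugin is the prime suspect: the executor thread is blocked in a socket read during the TLS handshake, inside StandardSlackService.publish.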
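      The search through the /threadDump output can be sketched roughly as follows. This is only an illustration in plain Java, not something Jenkins ships: the class name StuckSocketReads, the method findThreadsInSocketRead and the second thread in the sample input are made up for the example. It reports every thread whose top stack frame is the native socket read seen above.

```java
import java.util.ArrayList;
import java.util.List;

public class StuckSocketReads {
    /**
     * Returns the header line of every thread whose first stack frame is
     * java.net.SocketInputStream.socketRead0, i.e. a thread blocked in a socket read.
     * Assumes the dump format shown above: a quoted thread header followed by "at ..." frames.
     */
    static List<String> findThreadsInSocketRead(String dump) {
        List<String> hits = new ArrayList<>();
        String header = null;          // most recent thread header line
        boolean awaitingFirstFrame = false;
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                header = line;
                awaitingFirstFrame = true;
            } else if (awaitingFirstFrame && line.startsWith("at ")) {
                if (line.contains("SocketInputStream.socketRead0")) {
                    hits.add(header);
                }
                awaitingFirstFrame = false; // only the top frame is inspected
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String dump = String.join("\n",
            "\"Executor #10 for master : executing 0 Update R package docs #11580\" Id=1961561 Group=main RUNNABLE (in native)",
            "\tat java.net.SocketInputStream.socketRead0(Native Method)",
            "\tat java.net.SocketInputStream.socketRead(SocketInputStream.java:116)",
            "",
            // Hypothetical second thread, only here to show that it is filtered out.
            "\"Idle worker (made-up example)\" Id=42 Group=main WAITING",
            "\tat java.lang.Object.wait(Native Method)");
        findThreadsInSocketRead(dump).forEach(System.out::println);
    }
}
```

      A thread that sits in socketRead0 is not necessarily stuck; it only becomes suspicious when, as here, it stays there long after the build has finished.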

      Running netstat on the machine showed a lingering connection to 99.84.75.163:443. After I killed it with the following command:

      ss -K dst 99.84.75.163 dport = 443

      the job (and the associated thread in the thread dump) immediately disappeared.
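      This behaviour is consistent with a blocking socket read that has no timeout: the thread can only return once the peer answers or the connection goes away. The following is a minimal sketch of that mechanism using plain java.net sockets. It is not the Slack plugin's code and does not use Apache HttpClient; the class ReadTimeoutDemo, the method readTimesOut and the local "silent" server are assumptions made for the demonstration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {
    /**
     * Connects to a local server socket that never accepts or answers, then reads
     * with the given SO_TIMEOUT. Returns true if the read gave up with a timeout.
     * With a timeout of 0 the read would block indefinitely, much like the
     * executor thread in the dump above.
     */
    static boolean readTimesOut(int soTimeoutMillis) throws IOException {
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            client.setSoTimeout(soTimeoutMillis); // upper bound for each blocking read
            InputStream in = client.getInputStream();
            try {
                in.read();    // the peer never sends a byte
                return false; // reached only if data or end-of-stream arrived
            } catch (SocketTimeoutException e) {
                return true;  // the read was abandoned instead of hanging
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(500));
    }
}
```

      With a read timeout in place, the waiting thread gets an exception after the configured interval, so the connection does not have to be killed from outside the JVM.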

      Attachments

        Activity

          bra Attila Nagy created issue -
          bra Attila Nagy made changes -
          Description I have had a job stuck in the queue for 21 days now. According to the log, the build has failed:
          {noformat}
          Mar 26, 2019 7:38:04 PM hudson.model.Run execute
          INFO: 0 Update R package docs #11580 main build action completed: FAILURE
          Mar 26, 2019 7:38:04 PM jenkins.plugins.slack.SlackNotifier perform
          INFO: Performing complete notifications
          {noformat}
          But the job is still shown in the queue as running.

          I've tried everything I could read on [https://stackoverflow.com/questions/14456592/how-to-stop-an-unstoppable-zombie-job-on-jenkins-without-restarting-the-server,] but it's still there.

          I will restart the server next Monday, but I am opening this issue in the hope that something can be done.

          I couldn't find any other lines related to this job in the logs. The script itself terminated with a non-zero exit code (so it exited and was not left hanging).

          I've attached a gist with the /threadDump output. Searching for the job's name gives:
          {noformat}
          Executor #10 for master : executing 0 Update R package docs #11580
                  "Executor #10 for master : executing 0 Update R package docs #11580" Id=1961561 Group=main RUNNABLE (in native)
                   at java.net.SocketInputStream.socketRead0(Native Method)
                   at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
                   at java.net.SocketInputStream.read(SocketInputStream.java:171)
                   at java.net.SocketInputStream.read(SocketInputStream.java:141)
                   at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
                   at sun.security.ssl.InputRecord.read(InputRecord.java:503)
                   at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
                   - locked java.lang.Object@2644728a
                   at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
                   - locked java.lang.Object@18cea778
                   at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
                   at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
                   at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:275)
                   at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:254)
                   at org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:118)
                   at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)
                   at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:363)
                   at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:219)
                   at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
                   at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)
                   at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
                   at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
                   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
                   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
                   at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:163)
                   at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:104)
                   at jenkins.plugins.slack.ActiveNotifier.completed(ActiveNotifier.java:150)
                   at jenkins.plugins.slack.SlackNotifier.perform(SlackNotifier.java:444)
                   at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
                   at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
                   at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
                   at hudson.model.Build$BuildExecution.cleanUp(Build.java:196)
                   at hudson.model.Run.execute(Run.java:1863)
                   at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
                   at hudson.model.ResourceController.execute(ResourceController.java:97)
                   at hudson.model.Executor.run(Executor.java:429){noformat}
          I have had a job in the queue for 21 days now. According to the log, the build has failed:
          {noformat}
          Mar 26, 2019 7:38:04 PM hudson.model.Run execute
          INFO: 0 Update R package docs #11580 main build action completed: FAILURE
          Mar 26, 2019 7:38:04 PM jenkins.plugins.slack.SlackNotifier perform
          INFO: Performing complete notifications
          {noformat}
          But the job is still listed in the queue as running.

          I've tried everything I could read on [https://stackoverflow.com/questions/14456592/how-to-stop-an-unstoppable-zombie-job-on-jenkins-without-restarting-the-server], but it's still there.

          I will restart the server next Monday, but I am opening this issue in the hope that something can be done.

          I couldn't find any other lines related to this job in the logs. The script itself terminated with a non-zero exit code (so it exited and was not hanging around).

          I've attached a gist with the /threadDump output. Searching for the job's name gives:
          {noformat}
          Executor #10 for master : executing 0 Update R package docs #11580
                  "Executor #10 for master : executing 0 Update R package docs #11580" Id=1961561 Group=main RUNNABLE (in native)
                   at java.net.SocketInputStream.socketRead0(Native Method)
                   at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
                   at java.net.SocketInputStream.read(SocketInputStream.java:171)
                   at java.net.SocketInputStream.read(SocketInputStream.java:141)
                   at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
                   at sun.security.ssl.InputRecord.read(InputRecord.java:503)
                   at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
                   - locked java.lang.Object@2644728a
                   at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
                   - locked java.lang.Object@18cea778
                   at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
                   at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
                   at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:275)
                   at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:254)
                   at org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:118)
                   at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)
                   at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:363)
                   at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:219)
                   at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
                   at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)
                   at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
                   at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
                   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
                   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
                   at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:163)
                   at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:104)
                   at jenkins.plugins.slack.ActiveNotifier.completed(ActiveNotifier.java:150)
                   at jenkins.plugins.slack.SlackNotifier.perform(SlackNotifier.java:444)
                   at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
                   at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
                   at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
                   at hudson.model.Build$BuildExecution.cleanUp(Build.java:196)
                   at hudson.model.Run.execute(Run.java:1863)
                   at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
                   at hudson.model.ResourceController.execute(ResourceController.java:97)
                   at hudson.model.Executor.run(Executor.java:429){noformat}
           
          bra Attila Nagy made changes -
          Component/s slack-plugin [ 18321 ]
          bra Attila Nagy made changes -
          Description I have had a job in the queue for 21 days now. According to the log, the build has failed:
          {noformat}
          Mar 26, 2019 7:38:04 PM hudson.model.Run execute
          INFO: 0 Update R package docs #11580 main build action completed: FAILURE
          Mar 26, 2019 7:38:04 PM jenkins.plugins.slack.SlackNotifier perform
          INFO: Performing complete notifications
          {noformat}
          But the job is still listed in the queue as running.

          I've tried everything I could read on [https://stackoverflow.com/questions/14456592/how-to-stop-an-unstoppable-zombie-job-on-jenkins-without-restarting-the-server], but it's still there.

          I will restart the server next Monday, but I am opening this issue in the hope that something can be done.

          I couldn't find any other lines related to this job in the logs. The script itself terminated with a non-zero exit code (so it exited and was not hanging around).

          I've attached a gist with the /threadDump output. Searching for the job's name gives:
          {noformat}
          Executor #10 for master : executing 0 Update R package docs #11580
                  "Executor #10 for master : executing 0 Update R package docs #11580" Id=1961561 Group=main RUNNABLE (in native)
                   at java.net.SocketInputStream.socketRead0(Native Method)
                   at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
                   at java.net.SocketInputStream.read(SocketInputStream.java:171)
                   at java.net.SocketInputStream.read(SocketInputStream.java:141)
                   at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
                   at sun.security.ssl.InputRecord.read(InputRecord.java:503)
                   at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
                   - locked java.lang.Object@2644728a
                   at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
                   - locked java.lang.Object@18cea778
                   at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
                   at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
                   at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:275)
                   at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:254)
                   at org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:118)
                   at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)
                   at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:363)
                   at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:219)
                   at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
                   at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)
                   at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
                   at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
                   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
                   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
                   at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:163)
                   at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:104)
                   at jenkins.plugins.slack.ActiveNotifier.completed(ActiveNotifier.java:150)
                   at jenkins.plugins.slack.SlackNotifier.perform(SlackNotifier.java:444)
                   at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
                   at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
                   at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
                   at hudson.model.Build$BuildExecution.cleanUp(Build.java:196)
                   at hudson.model.Run.execute(Run.java:1863)
                   at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
                   at hudson.model.ResourceController.execute(ResourceController.java:97)
                   at hudson.model.Executor.run(Executor.java:429){noformat}
           
          I have had a job in the queue for 21 days now. According to the log, the build has failed:
          {noformat}
          Mar 26, 2019 7:38:04 PM hudson.model.Run execute
          INFO: 0 Update R package docs #11580 main build action completed: FAILURE
          Mar 26, 2019 7:38:04 PM jenkins.plugins.slack.SlackNotifier perform
          INFO: Performing complete notifications
          {noformat}
          But the job is still listed in the queue as running.

          I've tried everything I could read on [https://stackoverflow.com/questions/14456592/how-to-stop-an-unstoppable-zombie-job-on-jenkins-without-restarting-the-server], but it's still there.

          I will restart the server next Monday, but I am opening this issue in the hope that something can be done.

          I couldn't find any other lines related to this job in the logs. The script itself terminated with a non-zero exit code (so it exited and was not hanging around).

          I've attached a gist with the /threadDump output. Searching for the job's name gives:
          {noformat}
          Executor #10 for master : executing 0 Update R package docs #11580
                  "Executor #10 for master : executing 0 Update R package docs #11580" Id=1961561 Group=main RUNNABLE (in native)
                   at java.net.SocketInputStream.socketRead0(Native Method)
                   at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
                   at java.net.SocketInputStream.read(SocketInputStream.java:171)
                   at java.net.SocketInputStream.read(SocketInputStream.java:141)
                   at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
                   at sun.security.ssl.InputRecord.read(InputRecord.java:503)
                   at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
                   - locked java.lang.Object@2644728a
                   at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
                   - locked java.lang.Object@18cea778
                   at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
                   at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
                   at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:275)
                   at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:254)
                   at org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:118)
                   at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)
                   at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:363)
                   at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:219)
                   at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
                   at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)
                   at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
                   at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
                   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
                   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
                   at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:163)
                   at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:104)
                   at jenkins.plugins.slack.ActiveNotifier.completed(ActiveNotifier.java:150)
                   at jenkins.plugins.slack.SlackNotifier.perform(SlackNotifier.java:444)
                   at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
                   at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
                   at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
                   at hudson.model.Build$BuildExecution.cleanUp(Build.java:196)
                   at hudson.model.Run.execute(Run.java:1863)
                   at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
                   at hudson.model.ResourceController.execute(ResourceController.java:97)
                   at hudson.model.Executor.run(Executor.java:429){noformat}
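          Inspecting this, the Slack notification plugin becomes the suspect: the executor thread is blocked in a socket read during the TLS handshake started from StandardSlackService.publish.

          Running netstat on the machine shows a lingering connection to 99.84.75.163:443. After killing it with the following command: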
          {noformat}
          ss -K dst 99.84.75.163 dport = 443{noformat}
          the job (and the associated thread in the thread dump) immediately disappeared.
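The behaviour above is consistent with a blocking socket read that has no timeout: the peer goes silent and the reading thread waits forever. The following is only a minimal sketch of that mechanism using plain JDK sockets, not the plugin's code; the class name ReadTimeoutSketch, the method readTimesOut, and the 200 ms value are made up for illustration. It connects to a local server that never sends anything and shows that setSoTimeout turns the endless wait into a SocketTimeoutException.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of Jenkins or the Slack plugin.
class ReadTimeoutSketch {

    /**
     * Connects to a local server that accepts the TCP connection but never
     * sends data, then attempts a read with the given timeout.
     * Returns true if the read was aborted by a SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        // The "silent peer": listens on an ephemeral loopback port. The OS
        // completes the TCP handshake from the listen backlog, but nothing
        // is ever written to the connection.
        try (ServerSocket silentServer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {

            client.connect(
                    new InetSocketAddress(silentServer.getInetAddress(),
                                          silentServer.getLocalPort()),
                    timeoutMillis);

            // Without this call the read below would block indefinitely,
            // which is the state the executor thread appears to be in.
            client.setSoTimeout(timeoutMillis);

            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // unexpected: data or end of stream arrived
            } catch (SocketTimeoutException e) {
                return true;  // the read gave up after timeoutMillis
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

In Apache HttpClient 4.x the comparable settings are the connect and socket timeouts on RequestConfig. Whether the plugin configured them at the time is not shown in this report, so treat that part as an assumption.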
          timja Tim Jacomb made changes -
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Fixed but Unreleased [ 10203 ]
          timja Tim Jacomb made changes -
          Status Fixed but Unreleased [ 10203 ] Closed [ 6 ]

          People

            Unassigned Unassigned
            bra Attila Nagy
            Votes: 0
            Watchers: 2
