JENKINS-68656

SSH Slaves Plugin Deadlock while spinning up a new agent

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • None
    • Jenkins 2.332.3, OpenJDK 11.0.15, running on Ubuntu 20.04
      SSH Slaves Plugin 1.814.vc82988f54b_10 (tested with 1.33.0 as well)
      Anka Build Plugin 2.7.0
    • 1.821.vd834f8a_c390e

      The error observed is that agents simply hang while starting. This happens with about 5% of the VMs started in this manner.

      The Anka Build plugin is used, and the VM it spins up is 100% functional.

      Investigating the thread dump shows a deadlock between the launch and teardownConnection methods in SSHLauncher.

      I have attached the stack traces of both threads as files.


      The launch method seems to be hanging while executing this:

         java.lang.Thread.State: TIMED_WAITING (on object monitor)
              at java.lang.Object.wait(java.base@11.0.15/Native Method)
              - waiting on <no object reference available>
              at hudson.remoting.Request.call(Request.java:177)
              - waiting to re-lock in wait() <0x00000005f9721350> (a hudson.remoting.UserRequest)
              at hudson.remoting.Channel.call(Channel.java:999)
              at hudson.FilePath.act(FilePath.java:1194)
              at hudson.FilePath.act(FilePath.java:1183)
              at hudson.FilePath.exists(FilePath.java:1748)
              at jenkins.branch.WorkspaceLocatorImpl.load(WorkspaceLocatorImpl.java:254)
              at jenkins.branch.WorkspaceLocatorImpl.access$500(WorkspaceLocatorImpl.java:86)
              at jenkins.branch.WorkspaceLocatorImpl$Collector.onOnline(WorkspaceLocatorImpl.java:601)
              - locked <0x00000005f97214e0> (a java.lang.String)
              at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:727)
              at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:437)
              at hudson.plugins.sshslaves.SSHLauncher.startAgent(SSHLauncher.java:645)
              at hudson.plugins.sshslaves.SSHLauncher.lambda$launch$0(SSHLauncher.java:458)
              at hudson.plugins.sshslaves.SSHLauncher$$Lambda$393/0x0000000840c2c040.call(Unknown Source)
              at java.util.concurrent.FutureTask.run(java.base@11.0.15/FutureTask.java:264)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.15/ThreadPoolExecutor.java:1128)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.15/ThreadPoolExecutor.java:628)
              at java.lang.Thread.run(java.base@11.0.15/Thread.java:829)
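
For anyone comparing the two attached stack traces, here is a minimal Python sketch of how monitor ownership could be cross-checked in a jstack-style thread dump. It is not part of Jenkins or the plugin. The function names and the sample dump are hypothetical, and it assumes the usual jstack layout: a quoted thread name on the header line, then `- locked <addr>` and `- waiting to lock <addr>` lines. It follows only "waiting to lock" edges, so a thread parked in a timed wait while holding a monitor (as in the trace above) is listed as a holder but does not close a cycle by itself.

```python
import re

HEADER = re.compile(r'^"(?P<name>[^"]+)"')
LOCKED = re.compile(r'-\s+locked\s+<(0x[0-9a-fA-F]+)>')
WAITING = re.compile(r'-\s+waiting to lock\s+<(0x[0-9a-fA-F]+)>')

# Synthetic two-thread dump, for illustration only (not from this issue).
SAMPLE_DUMP = '''
"worker-A" #1 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
        at demo.A.run(A.java:10)
        - waiting to lock <0x0000000000000002> (a java.lang.Object)
        - locked <0x0000000000000001> (a java.lang.Object)

"worker-B" #2 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
        at demo.B.run(B.java:10)
        - waiting to lock <0x0000000000000001> (a java.lang.Object)
        - locked <0x0000000000000002> (a java.lang.Object)
'''


def parse_dump(text):
    """Return (held, wants): monitor address -> holder, thread -> wanted monitor."""
    held, wants = {}, {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        if current is None:
            continue
        locked = LOCKED.search(line)
        if locked:
            held[locked.group(1)] = current
            continue
        waiting = WAITING.search(line)
        if waiting:
            wants[current] = waiting.group(1)
    return held, wants


def find_lock_cycles(text):
    """Follow 'thread wants monitor held by thread' edges and report cycles."""
    held, wants = parse_dump(text)
    cycles, reported = [], set()
    for start in wants:
        path = []
        thread = start
        while thread is not None and thread in wants and thread not in path:
            path.append(thread)
            thread = held.get(wants[thread])
        if thread is not None and thread in path:
            cycle = path[path.index(thread):]
            key = frozenset(cycle)
            if key not in reported:
                reported.add(key)
                cycles.append(cycle)
    return cycles


if __name__ == "__main__":
    print(find_lock_cycles(SAMPLE_DUMP))  # [['worker-A', 'worker-B']]
```

Running it against a full dump from the controller would at least show which thread holds `<0x00000005f97214e0>` and whether another thread is queued on it.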


          niv keidan created issue -

          Ivan Fernandez Calvo added a comment - - edited

          Does it happen with SSH Agents not launched with the Anka plugin?
          Do you have the logs of one of those agents, to see at which stage the connection is failing?

          Ivan Fernandez Calvo made changes -
          Component/s New: anka-build-plugin [ 23042 ]

          niv keidan added a comment -

          We just found out that executing sudo kill -9 <pid> on the SSHD process for that specific connection on a VM results in a channel failure. Jenkins then recognizes that the channel is broken and cleans everything up.

          Agent log (does not look 100% the same every time):

          [06/01/22 11:06:51] [SSH] Checking java version of /Library/Java/JavaVirtualMachines/temurin-11.jdk/Contents/Home//bin/java
          [06/01/22 11:06:51] [SSH] /Library/Java/JavaVirtualMachines/temurin-11.jdk/Contents/Home//bin/java -version returned 11.0.14.
          [06/01/22 11:06:51] [SSH] Starting sftp client.
          [06/01/22 11:06:51] [SSH] Copying latest remoting.jar...
          [06/01/22 11:06:52] [SSH] Copied 1,524,115 bytes.
          Expanded the channel window size to 4MB
          [06/01/22 11:06:52] [SSH] Starting agent process: cd "/usr/local/mobile/mnt/workspaces" && /Library/Java/JavaVirtualMachines/temurin-11.jdk/Contents/Home//bin/java -jar remoting.jar -workDir /usr/local/mobile/mnt/workspaces -jar-cache /usr/local/mobile/mnt/workspaces/remoting/jarCache
          Jun 01, 2022 11:06:52 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
          INFO: Using /usr/local/mobile/mnt/workspaces/remoting as a remoting work directory
          Jun 01, 2022 11:06:53 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
          INFO: Both error and output logs will be printed to /usr/local/mobile/mnt/workspaces/remoting
          <===[JENKINS REMOTING CAPACITY]===>channel started
          Remoting version: 4.13
          This is a Unix agent
          WARNING: An illegal reflective access operation has occurred
          WARNING: Illegal reflective access by jenkins.slaves.StandardOutputSwapper$ChannelSwapper to constructor java.io.FileDescriptor(int)
          WARNING: Please consider reporting this to the maintainers of jenkins.slaves.StandardOutputSwapper$ChannelSwapper
          WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
          WARNING: All illegal access operations will be denied in a future release
          Evacuated stdout
          Jun 01, 2022 11:15:54 AM hudson.slaves.ChannelPinger$1 onDead
          INFO: Ping failed. Terminating the channel channel.
          java.util.concurrent.TimeoutException: Ping started at 1654081914259 hasn't completed by 1654082154266
          at hudson.remoting.PingThread.ping(PingThread.java:132)
          at hudson.remoting.PingThread.run(PingThread.java:88)
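
The two epoch values in the TimeoutException can be decoded to see how long the ping waited before the channel was declared dead. The short Python sketch below uses only the numbers printed in the log above; the helper names are made up for illustration, and interpreting the timestamps as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone

PING_STARTED_MS = 1654081914259   # "Ping started at ..." from the log
PING_DEADLINE_MS = 1654082154266  # "... hasn't completed by ..." from the log


def from_epoch_ms(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    whole_seconds = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return whole_seconds + timedelta(milliseconds=ms % 1000)


def ping_window_seconds(started_ms, deadline_ms):
    """Time between the ping start and the moment it was given up on."""
    return (deadline_ms - started_ms) / 1000


if __name__ == "__main__":
    print(ping_window_seconds(PING_STARTED_MS, PING_DEADLINE_MS))  # 240.007
    print(from_epoch_ms(PING_STARTED_MS).isoformat())   # 2022-06-01T11:11:54.259000+00:00
    print(from_epoch_ms(PING_DEADLINE_MS).isoformat())  # 2022-06-01T11:15:54.266000+00:00
```

The failing ping therefore ran for about 240 seconds, and its deadline (11:15:54 when read as UTC) lines up with the "11:15:54 AM" timestamp on the onDead log line, roughly nine minutes after "channel started".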


          niv keidan added a comment -

          Also, from Jenkins system log:

          WARNING c.c.j.s.i.AboutJenkins$NodesContent#printTo: Could not get agent.jar version for AnkaOB-ephemeral-macos-12.2-xcode13.3-special-test-fAuJ3
          java.util.concurrent.TimeoutException
          at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:204)
          at com.cloudbees.jenkins.support.util.CallAsyncWrapper.callAsync(CallAsyncWrapper.java:24)
          Caused: java.io.IOException
          at com.cloudbees.jenkins.support.util.CallAsyncWrapper.callAsync(CallAsyncWrapper.java:29)
          at com.cloudbees.jenkins.support.AsyncResultCache.get(AsyncResultCache.java:59)
          at com.cloudbees.jenkins.support.AsyncResultCache.get(AsyncResultCache.java:33)
          at com.cloudbees.jenkins.support.impl.AboutJenkins$NodesContent.printTo(AboutJenkins.java:679)
          at com.cloudbees.jenkins.support.api.PrefilteredPrintedContent.writeTo(PrefilteredPrintedContent.java:63)
          at com.cloudbees.jenkins.support.api.PrefilteredPrintedContent.writeTo(PrefilteredPrintedContent.java:56)
          at com.cloudbees.jenkins.support.SupportPlugin.writeBundle(SupportPlugin.java:377)
          at com.cloudbees.jenkins.support.SupportPlugin.writeBundle(SupportPlugin.java:316)
          at com.cloudbees.jenkins.support.SupportAction.prepareBundle(SupportAction.java:357)
          at com.cloudbees.jenkins.support.SupportAction.doGenerateAllBundles(SupportAction.java:307)
          at java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
          at org.kohsuke.stapler.Function$MethodFunction.invoke(Function.java:398)
          at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:410)
          at org.kohsuke.stapler.interceptor.RequirePOST$Processor.invoke(RequirePOST.java:78)
          at org.kohsuke.stapler.PreInvokeInterceptedFunction.invoke(PreInvokeInterceptedFunction.java:26)
          at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:208)
          at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:141)
          at org.kohsuke.stapler.MetaClass$11.doDispatch(MetaClass.java:558)
          at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:59)
          at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:766)
          at org.kohsuke.stapler.Stapler.invoke(Stapler.java:898)
          at org.kohsuke.stapler.MetaClass$9.dispatch(MetaClass.java:475)
          at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:766)
          at org.kohsuke.stapler.Stapler.invoke(Stapler.java:898)
          at org.kohsuke.stapler.Stapler.invoke(Stapler.java:694)
          at org.kohsuke.stapler.Stapler.service(Stapler.java:240)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
          at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
          at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1626)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:157)
          at jenkins.security.ResourceDomainFilter.doFilter(ResourceDomainFilter.java:81)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154)
          at jenkins.telemetry.impl.UserLanguages$AcceptLanguageFilter.doFilter(UserLanguages.java:129)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154)
          at com.cloudbees.jenkins.support.slowrequest.SlowRequestFilter.doFilter(SlowRequestFilter.java:37)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154)
          at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:59)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154)
          at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:239)
          at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:215)
          at net.bull.javamelody.PluginMonitoringFilter.doFilter(PluginMonitoringFilter.java:88)
          at org.jvnet.hudson.plugins.monitoring.HudsonMonitoringFilter.doFilter(HudsonMonitoringFilter.java:114)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154)
          at jenkins.metrics.impl.MetricsFilter.doFilter(MetricsFilter.java:125)
          at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154)
          at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:160)
          at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
          at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
          at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:154)
          at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
          at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:94)
          at jenkins.security.AcegiSecurityExceptionFilter.doFilter(AcegiSecurityExceptionFilter.java:52)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:99)
          at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:54)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:99)
          at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:122)
          at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:116)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:99)
          at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:109)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:99)
          at org.springframework.security.web.authentication.rememberme.RememberMeAuthenticationFilter.doFilter(RememberMeAuthenticationFilter.java:102)
          at org.springframework.security.web.authentication.rememberme.RememberMeAuthenticationFilter.doFilter(RememberMeAuthenticationFilter.java:93)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:99)
          at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:219)
          at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:213)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:99)
          at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:97)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:99)
          at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:110)
          at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:80)
          at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:63)
          at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:99)
          at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:111)
          at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:172)
          at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
          at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
          at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:53)
          at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
          at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
          at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:86)
          at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
          at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
          at org.kohsuke.stapler.DiagnosticThreadNameFilter.doFilter(DiagnosticThreadNameFilter.java:30)
          at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
          at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
          at jenkins.security.SuspiciousRequestFilter.doFilter(SuspiciousRequestFilter.java:38)
          at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
          at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:578)
          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
          at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
          at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
          at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434)
          at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
          at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
          at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
          at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349)
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
          at org.eclipse.jetty.server.Server.handle(Server.java:516)
          at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
          at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
          at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
          at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
          at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
          at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
          at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
          at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:386)
          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
          at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
          at java.base/java.lang.Thread.run(Thread.java:829)


          niv keidan added a comment -

          Similar stack traces also exist for:

          • WARNING c.c.j.s.i.EnvironmentVariables$2#printTo: Could not record environment of node ...
          • WARNING c.c.j.s.i.AboutJenkins$NodesContent#printTo: Could not get agent.jar version for...
          • WARNING c.c.j.s.i.AboutJenkins$NodesContent#printTo: Could not get Java info for...
          • WARNING c.c.j.s.i.AboutJenkins$NodeChecksumsContent#printTo: Could not compute checksums on agent ...


          Ivan Fernandez Calvo added a comment -

          >We just found out that executing sudo kill -9 <pid> for SSHD process for that specific connection on a VM, will result in channel failure Jenkins will recognize that channel is broken and clean everything up.

          So the agent is waiting for something, and if you kill the SSHD service everything is cleaned up correctly. The connection timeout (210 seconds) should do the same thing; you can customize that timeout.

          I see you are using Temurin JDK 11 on the agents and that they run macOS. Which JDK do you use on the Jenkins controller? Do you see any correlation between JDK versions or OS versions on the agents that fail to start?

          I think it is not related to the Jenkins plugins; it looks like a JDK version/flavor or OS version issue.

          niv keidan added a comment -

          The master is on Ubuntu 20.04, using OpenJDK 11.0.15.

          We are seeing agents being non-responsive for 12+ minutes, so the timeout mechanism is failing somewhere :/
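
One general way a timeout can look like it never fired is that only the waiting side gives up, while the worker thread stays blocked. The Python sketch below is an analogy only, not the plugin's code and not a diagnosis of this issue; `bounded_call` is a hypothetical helper built on the standard `concurrent.futures` module.

```python
import concurrent.futures
import threading


def bounded_call(fn, timeout_s):
    """Run fn on a worker thread and wait at most timeout_s for its result.

    Returns ("done", result) on success, or ("timed out", still_running),
    where still_running says whether the worker is still executing fn.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return "done", future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The caller stops waiting here, but the worker itself is not stopped.
        return "timed out", future.running()
    finally:
        # Do not block on the worker; it may still be stuck inside fn.
        pool.shutdown(wait=False)


if __name__ == "__main__":
    release = threading.Event()
    # The worker blocks until released (or for 5 s at most), like a remote call
    # that never gets an answer.
    print(bounded_call(lambda: release.wait(5), 0.2))  # ('timed out', True)
    release.set()  # let the blocked worker finish so the program can exit
```

If something similar happens on the controller, the launch would be reported as timed out (or not reported at all) while the thread that holds the monitor keeps waiting, which would match an agent staying unresponsive well beyond the configured timeout.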

          niv keidan made changes -
          Environment Original: Jenkins 2.332.3
          SSH Slaves Plugin 1.814.vc82988f54b_10 (tested with 1.33.0 as well)
          Anka Build Plugin 2.7.0
          New: Jenkins 2.332.3, OpenJDK 11.0.15, running on Ubuntu 20.04
          SSH Slaves Plugin 1.814.vc82988f54b_10 (tested with 1.33.0 as well)
          Anka Build Plugin 2.7.0

          Ivan Fernandez Calvo added a comment -

          Do all the agents get stuck once the connection has completed, or before? This message means the agent is connected and the channel is up.

          <===[JENKINS REMOTING CAPACITY]===>channel started
          Remoting version: 4.13
          This is a Unix agent

            ifernandezcalvo Ivan Fernandez Calvo
            niv_keidan_veertu niv keidan