No notification re. failed job if agent goes offline during the build

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major

      A job running on an Agent node failed with the following error:

      Agent went offline during the build
      Build step 'Execute shell' marked build as failure
      ERROR: Step ‘E-mail Notification’ failed: no workspace for prod-utilization-triggers #5123

      As the error implies, Jenkins did not send a notification about this failure.
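
      The failure mode above can be modeled with a small, self-contained Java sketch. It is a simplified illustration only: `NotificationSketch`, `Build`, `Outbox`, and `notifyFailure` are hypothetical names, not Jenkins or plugin classes, and the lenient branch is just one possible mitigation, not necessarily what any fix for this issue does. The assumption is that the notification step is refused whenever the build has no workspace, which is what the log line suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Simplified model only: these types are hypothetical and are NOT Jenkins classes.
final class NotificationSketch {

    /** A finished build; the workspace is empty when the agent has gone offline. */
    static final class Build {
        final String name;
        final int number;
        final Optional<String> workspace;

        Build(String name, int number, Optional<String> workspace) {
            this.name = name;
            this.number = number;
            this.workspace = workspace;
        }

        String displayName() {
            return name + " #" + number;
        }
    }

    /** Collects the e-mails that would have been sent. */
    static final class Outbox {
        final List<String> sent = new ArrayList<>();
    }

    /**
     * Runs a notification step and returns its console message.
     * With requireWorkspace = true, the step is refused before it can send
     * anything when the workspace is missing (the behavior assumed from the log).
     * With requireWorkspace = false, the notification is sent regardless.
     */
    static String notifyFailure(Build build, boolean requireWorkspace, Outbox outbox) {
        if (requireWorkspace && !build.workspace.isPresent()) {
            return "ERROR: Step 'E-mail Notification' failed: no workspace for "
                    + build.displayName();
        }
        outbox.sent.add("Build failed: " + build.displayName());
        return "Sent failure notification for " + build.displayName();
    }

    public static void main(String[] args) {
        // The agent went offline, so the build has no workspace.
        Build offline = new Build("prod-utilization-triggers", 5123, Optional.empty());

        Outbox strict = new Outbox();
        System.out.println(notifyFailure(offline, true, strict));
        System.out.println("E-mails sent (strict): " + strict.sent.size());

        Outbox lenient = new Outbox();
        System.out.println(notifyFailure(offline, false, lenient));
        System.out.println("E-mails sent (lenient): " + lenient.sent.size());
    }
}
```

      In this model the strict variant reproduces the reported symptom: the build is marked as failed, but the outbox stays empty, so nobody is told. The lenient variant shows that a failure notice only needs the build's name and number, not its workspace.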

          [JENKINS-37812] No notification re. failed job if agent goes offline during the build

          Oleg Nenashev added a comment -

          Without a full stack trace I cannot say where the issue is, but it is in one of the email plugins, not in Jenkins core.

          Chris Wilson added a comment -

          This is happening to us too. Here is a stack trace.

          [Sorry if this is badly formatted, I tried hard to make the comment editor treat it as preformatted/monospace but it looks like I am unable to.]

          {{
          FATAL: command execution failed
          java.io.IOException
          at hudson.remoting.Channel.close(Channel.java:1402)
          at hudson.remoting.Channel.close(Channel.java:1358)
          at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:745)
          at hudson.slaves.SlaveComputer.kill(SlaveComputer.java:712)
          at hudson.model.AbstractCIBase.killComputer(AbstractCIBase.java:88)
          at jenkins.model.Jenkins.access$2000(Jenkins.java:302)
          at jenkins.model.Jenkins$20.run(Jenkins.java:3328)
          at hudson.model.Queue._withLock(Queue.java:1370)
          at hudson.model.Queue.withLock(Queue.java:1247)
          at jenkins.model.Jenkins._cleanUpDisconnectComputers(Jenkins.java:3322)
          at jenkins.model.Jenkins.cleanUp(Jenkins.java:3197)
          at hudson.WebAppMain.contextDestroyed(WebAppMain.java:379)
          at org.eclipse.jetty.server.handler.ContextHandler.callContextDestroyed(ContextHandler.java:898)
          at org.eclipse.jetty.servlet.ServletContextHandler.callContextDestroyed(ServletContextHandler.java:545)
          at org.eclipse.jetty.server.handler.ContextHandler.stopContext(ContextHandler.java:873)
          at org.eclipse.jetty.servlet.ServletContextHandler.stopContext(ServletContextHandler.java:355)
          at org.eclipse.jetty.webapp.WebAppContext.stopWebapp(WebAppContext.java:1507)
          at org.eclipse.jetty.webapp.WebAppContext.stopContext(WebAppContext.java:1471)
          at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:927)
          at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:271)
          at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:569)
          at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
          at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:142)
          at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:160)
          at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:124)
          at org.eclipse.jetty.server.Server.doStop(Server.java:523)
          at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
          at winstone.Launcher.shutdown(Launcher.java:307)
          at winstone.ShutdownHook.run(ShutdownHook.java:25)
          Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on xxx-xxxx failed. The channel is closing down or has closed down
          at hudson.remoting.Channel.call(Channel.java:902)
          at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:281)
          at com.sun.proxy.$Proxy57.isAlive(Unknown Source)
          at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137)
          at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129)
          at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
          at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
          at hudson.model.Build$BuildExecution.build(Build.java:206)
          at hudson.model.Build$BuildExecution.doRun(Build.java:163)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
          at hudson.model.Run.execute(Run.java:1727)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:97)
          at hudson.model.Executor.run(Executor.java:429)
          Build step 'Execute shell' marked build as failure
          ERROR: Step ‘E-mail Notification’ failed: no workspace for xxx_xxx_xxx #101
          Notifying upstream projects of job completion
          Finished: FAILURE
          }}

          George Sakhnovsky added a comment - edited

          If it's helpful to anyone: as a workaround, we installed the Mail Watcher Plugin and configured it to send e-mail alerts on node online status changes. This gave us some visibility into the scenario of jobs failing quietly due to agent issues.

          Francisco Fernández added a comment -

          PR merged

            rbernier Roger Bernier
            gsakhnovsky George Sakhnovsky