  1. Jenkins
  2. JENKINS-34712

"master is offline" preventing Pipeline from executing


Details

    Description

      Our own Jenkins Pipeline projects seem to get stuck in a "master is offline" state when attempting to run on our clusters, which have zero executors assigned to the master node.

      It's unclear what, other than a service restart, will clear this up.

      Steps to reproduce:

      1. Start a Pipeline job.
      2. Force the master to run out of storage.
      3. Shut down the master and free up storage.
      4. Restart the master and confirm it's up.
      5. Observe that it is still marked as offline for a long time (30+ minutes).

          Activity

            rtyler R. Tyler Croy added a comment -

            Correction, a restart has not corrected the issue. The Pipeline is stuck again in the build queue

            abayer Andrew Bayer added a comment -

            A few questions -

            • What versions of core and Pipeline plugins are running?
            • Is the jenkins.io job "running"? That is, is the job itself blinking, etc.? If it is, then it's stuck executing part of itself; if it isn't, then it's stuck even before that.
            abayer Andrew Bayer added a comment -

            At first glance, I can't see how it'd ever have "master is offline" as a blocked reason (https://github.com/jenkinsci/workflow-job-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowJob.java#L314) but I may be missing something. jglick, any thoughts?

            danielbeck Daniel Beck added a comment -

            JENKINS-7291 should ensure master always has a computer.

            rtyler R. Tyler Croy added a comment -

            abayer, the Environment section of this JIRA has the information answering question number one.

            As for the second, this is a Multibranch project. The "master" branch "job" is not blinking, and the "jenkins.io" folder is not blinking either, though I don't think it does that.

            Both you and danielbeck have access to this instance, so you can "see" it live, but as this is a managed host, please refrain from tinkering with settings and whatnot.

            jglick Jesse Glick added a comment -

            I have never heard of this problem before, and have no idea offhand how it could occur, since as danielbeck notes, there is always a MasterComputer even if you have configured zero heavyweight executors—WorkflowJob uses flyweights.

            As far as I know I lack administrative access to the server in question to do any live debugging.
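
The distinction drawn above between heavyweight executors and flyweight tasks can be illustrated with a small model. This is only a sketch, not Jenkins code: the `Node` class and both functions are hypothetical names invented for illustration, and the rule that a flyweight task needs only an online node (not a configured executor) is taken from the discussion in this thread.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a Jenkins node; not the real Jenkins API."""
    name: str
    num_executors: int      # configured heavyweight executors
    offline: bool = False   # True once a monitor or an admin takes the node offline


def can_run_heavyweight(node: Node, busy_executors: int = 0) -> bool:
    # Heavyweight tasks need a free configured executor on an online node.
    return not node.offline and busy_executors < node.num_executors


def can_run_flyweight(node: Node) -> bool:
    # Flyweight tasks do not occupy a configured executor, so the executor
    # count is irrelevant; in this model only the online state matters.
    return not node.offline


master = Node("master", num_executors=0)
print(can_run_heavyweight(master))  # False: no executors configured
print(can_run_flyweight(master))    # True: online, executor count ignored
master.offline = True
print(can_run_flyweight(master))    # False: offline blocks flyweights too
```

Under these assumptions, a master with zero executors can still host Pipeline flyweight tasks, but only while it is online.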

            jglick Jesse Glick added a comment -

            Jenkins.instance.selfLabel.offline, which should never be possible.

            danielbeck Daniel Beck added a comment -

            jglick We learned a few hours ago that master was marked offline due to disk space, and since it has zero executors, it wasn't apparent from the UI (as an executor-less master isn't shown on the executors pane).

            For some reason that offline state was preserved across restarts, and apparently for longer than disk space cleanup + 30 minutes for the next monitor run, so maybe something was wrong there, but that was the offline cause.
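
The monitor behaviour described above can be modelled roughly as follows. This is a simplified sketch, not the actual Jenkins node monitor: the `Node` class, `run_disk_space_monitor`, and the 1 GiB threshold are all assumptions made for illustration. The point it shows is that freeing disk space does not by itself change the node's state; the node only comes back on the next monitor pass.

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1024 ** 3


@dataclass
class Node:
    """Hypothetical stand-in for a Jenkins node; not the real Jenkins API."""
    name: str
    offline_cause: Optional[str] = None

    @property
    def offline(self) -> bool:
        return self.offline_cause is not None


def run_disk_space_monitor(node: Node, free_bytes: int,
                           threshold_bytes: int = 1 * GIB) -> None:
    """One monitor pass. The threshold value is an assumption."""
    if free_bytes < threshold_bytes:
        if not node.offline:
            node.offline_cause = "disk space"
    elif node.offline_cause == "disk space":
        # Only undo an offline state that this monitor itself caused.
        node.offline_cause = None


master = Node("master")
run_disk_space_monitor(master, free_bytes=100 * 1024)  # disk is full
print(master.offline)  # True

# Disk is cleaned up here, but nothing re-evaluates the node yet.
print(master.offline)  # True, still offline until the next pass

run_disk_space_monitor(master, free_bytes=20 * GIB)    # next monitor run
print(master.offline)  # False
```

In this model the gap between cleanup and recovery is bounded by the monitor interval, which is why an offline state lasting longer than that interval looked suspicious in the report.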

            danielbeck Daniel Beck added a comment -

            Looks a lot like Not A Defect to me. If the master is offline (especially for disk space reasons), no need to run any builds anywhere. The only RFE I could think of would be to not hide the executor-less master node in the executors sidepanel if it's marked offline.

            jglick Jesse Glick added a comment -

            Sounds like a core bug.

            danielbeck Daniel Beck added a comment -

            jglick What's the bug? That the node monitors work? That flyweight tasks don't run on marked-offline nodes?

            jglick Jesse Glick added a comment -

            I guess that the master node should be displayed when it is offline.


            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            core/src/main/resources/lib/hudson/executors.jelly
            http://jenkins-ci.org/commit/jenkins/b67a30f8daff936c91fd54b90bef6c366707a8f1
            Log:
            Merge pull request #3294 from dwnusbaum/JENKINS-34712

            JENKINS-34712 Always show the master node when it is offline

            Compare: https://github.com/jenkinsci/jenkins/compare/5c8cc45900bf...b67a30f8daff
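
The visibility rule named in the commit message can be sketched as a pair of predicates. This is not the Jelly logic from `executors.jelly`, which is not quoted in this issue; both function names are hypothetical, and the "before" rule is inferred only from the comment that an executor-less master is not shown on the executors pane.

```python
def master_shown_before(num_executors: int, offline: bool) -> bool:
    # Assumed earlier rule: the master appears only if it has executors,
    # so an executor-less master is hidden even when it is offline.
    return num_executors > 0


def master_shown_after(num_executors: int, offline: bool) -> bool:
    # Rule per the commit message: always show the master when it is offline.
    return num_executors > 0 or offline


# The case from this issue: zero executors, marked offline for disk space.
print(master_shown_before(0, offline=True))  # False: the problem was invisible
print(master_shown_after(0, offline=True))   # True: the offline master is listed
```

An executor-less master that is online stays hidden under both rules in this sketch, so the change only affects the offline case.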

            danielbeck Daniel Beck added a comment -

            Released in 2.108.


            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            core/src/main/resources/lib/hudson/executors.jelly
            http://jenkins-ci.org/commit/jenkins/20d44c5aa750f6fece96f83f0f7ed519e9df2e54
            Log:
            Merge pull request #3294 from dwnusbaum/JENKINS-34712

            JENKINS-34712 Always show the master node when it is offline

            (cherry picked from commit b67a30f8daff936c91fd54b90bef6c366707a8f1)


            People

              dnusbaum Devin Nusbaum
              rtyler R. Tyler Croy