Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-42043

A NPE in SlaveComputer.setNode shouldn't prevent Jenkins from starting

      Currently, an exception thrown in SlaveComputer.setNode() prevents Jenkins from starting (JENKINS-38389).

      It would be nice to catch RuntimeException there for robustness, as advised by jglick there.

          [JENKINS-42043] A NPE in SlaveComputer.setNode shouldn't prevent Jenkins from starting

          Baptiste Mathus created issue -
          Jesse Glick made changes -
          Link New: This issue relates to JENKINS-38389 [ JENKINS-38389 ]

          rsandell added a comment -

          But what should happen if the operation fails?
          The stacktrace shows that Jenkins is currently updating all "computers" to see which ones to reuse/what executors to remove. If the node failed to update should it then be considered for reuse or for removal?
          The error in ec2 plugin suggests that it was a "harmless" mistake and we should probably keep it around, while I would guess in most cases since it is in error it should be removed from the list.

          rsandell added a comment - But what should happen if the operation fails? The stacktrace shows that Jenkins is currently updating all "computers" to see which ones to reuse/what executors to remove. If the node failed to update should it then be considered for reuse or for removal? The error in ec2 plugin suggests that it was a "harmless" mistake and we should probably keep it around, while I would guess in most cases since it is in error it should be removed from the list.

          My $0.02 - I think it should keep the node and act as if it is a disconnected node instead of removing the node (i.e. ignore it and move on with the startup). We should never be removing nodes from the list that fail to connect. We should only remove them if the node is manually removed by an administrator, or the node is removed by the plugin via a timeout.

          Steven Christou added a comment - My $0.02 - I think it should keep the node and act as if it is a disconnected node instead of removing the node (i.e. ignore it and move on with the startup). We should never be removing nodes from the list that fail to connect. We should only remove them if the node is manually removed by an administrator, or the node is removed by the plugin via a timeout.

          Andrew Bayer added a comment -

          That sounds right to me.

          Andrew Bayer added a comment - That sounds right to me.
          Oleg Nenashev made changes -
          Assignee New: Oleg Nenashev [ oleg_nenashev ]
          Oleg Nenashev made changes -
          Labels Original: robustness New: newbie-friendly robustness
          Oleg Nenashev made changes -
          Issue Type Original: Improvement [ 4 ] New: Bug [ 1 ]
          Andrew Bayer made changes -
          Assignee Original: Oleg Nenashev [ oleg_nenashev ] New: Andrew Bayer [ abayer ]
          Andrew Bayer made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]

            abayer Andrew Bayer
            batmat Baptiste Mathus
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: