Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-49610

The SCMSource.setOwner(owner) contract needs updating to include ensuring that an ID has been assigned

    • Icon: Improvement Improvement
    • Resolution: Unresolved
    • Icon: Major Major
    • scm-api-plugin
    • None

      Discussion

      We need to ensure that issues like JENKINS-48571 are more easily self-diagnosable by users.

      At first glance there are two ways this could be solved:

      1. We could enforce the id assignment by throwing an IllegalStateException or similar if the id is null at the time of setOwner(non-null)
      2. We could ensure an id assignment by assigning one if the id is null at the time of setOwner(non-null) 

      There may also be other potential solutions.

      Acceptance Criteria

      • Assessment criteria for selection of a proposed solution have been defined and reviewed by stephenconnolly and michaelneale
      • The list candidate solutions to be assessed has been defined
      • The results of the assessment process have been reviewed with stephenconnolly and michaelneale and the winner agreed.
      • The winning solution has been implemented.
      • The documentation has been updated to include the impact.
      • Minimization of the risk of "Build storms" has been included in the assessment criteria

      Critical assumpitions

      (if any of these prove to be broken in the process of resolving this ticket then a replan is required)

      • There is no good reason to call setId after the owner has been assigned.

          [JENKINS-49610] The SCMSource.setOwner(owner) contract needs updating to include ensuring that an ID has been assigned

          FTR my initial gut feeling is that throwing IllegalStateException is the better way to go as the UI code paths should always be pre-assigning an ID and this would catch users of e.g. JobDSL and present them with a meaningful exception that guides them to their solution. That would have the side-effect of breaking existing (IMHO already broken) systems... but the other approach would give those systems build storms which would be a much more subtle problem to notice

          Stephen Connolly added a comment - FTR my initial gut feeling is that throwing IllegalStateException is the better way to go as the UI code paths should always be pre-assigning an ID and this would catch users of e.g. JobDSL and present them with a meaningful exception that guides them to their solution. That would have the side-effect of breaking existing (IMHO already broken) systems... but the other approach would give those systems build storms which would be a much more subtle problem to notice

          Assessment criteria for selection of a proposed solution

          I'm not looking for anything complex or a heavyweight process... could be as simple as "I'm going to setup a test system with a broken job-dsl script that doesn't assign {{id}}s and see which causes rebuilds" or could be as complex as 10 pages of various metrics (I would hope you err on the side of simple though)

          Just let's get the criteria agreed, documented and record the results and decision.

          Stephen Connolly added a comment - Assessment criteria for selection of a proposed solution I'm not looking for anything complex or a heavyweight process... could be as simple as "I'm going to setup a test system with a broken job-dsl script that doesn't assign {{id}}s and see which causes rebuilds" or could be as complex as 10 pages of various metrics (I would hope you err on the side of simple though) Just let's get the criteria agreed, documented and record the results and decision.

          There may also be other potential solutions.

          e.g. your original PR could be a potential solution if you want to put that in the assessment.

          Feel free to just take my two proposed candidate solutions or to investigate and come up with additional proposed solutions... you need to assess at least two solutions though, and I'd like the "throwing IllegalStateException" to be included in the assessment

          Stephen Connolly added a comment - There may also be other potential solutions. e.g. your original PR could be a potential solution if you want to put that in the assessment. Feel free to just take my two proposed candidate solutions or to investigate and come up with additional proposed solutions... you need to assess at least two solutions though, and I'd like the "throwing IllegalStateException " to be included in the assessment

          Ok, after seeing how many users of JobDSL are being caught by this, I am inclined to thing we critically need to throw an exception in setOwner if id==null even if this risks breaking existing job-dsl scripts as anything else will risk rebuild storms after job-dsl overwrites the configuration which IMHO would help users fix the bug in their job-dsl scripts rather than mask it further down in the weeds.

          We cannot make this change cleanly, however, as this change will cause BlueOcean to blow up. I think a hack-fix may be required until BlueOcean is fixed... namely we create the ISE and walk the stack to see if BlueOcean is in the stack-trace... if we see BlueOcean (by String classname) then we Log the exception at SEVERE and set the id to blueocean otherwise we throw.

          That would be my suggestion, but we should assess any alternatives as the stack walking is certainly a bad code smell... even if a non-performance critical code path... and we need to assess what would happen if job-dsl blows up... does it delete the job or leave it alone?

          Stephen Connolly added a comment - Ok, after seeing how many users of JobDSL are being caught by this, I am inclined to thing we critically need to throw an exception in setOwner if id==null even if this risks breaking existing job-dsl scripts as anything else will risk rebuild storms after job-dsl overwrites the configuration which IMHO would help users fix the bug in their job-dsl scripts rather than mask it further down in the weeds. We cannot make this change cleanly, however, as this change will cause BlueOcean to blow up. I think a hack-fix may be required until BlueOcean is fixed... namely we create the ISE and walk the stack to see if BlueOcean is in the stack-trace... if we see BlueOcean (by String classname) then we Log the exception at SEVERE and set the id to blueocean otherwise we throw. That would be my suggestion, but we should assess any alternatives as the stack walking is certainly a bad code smell... even if a non-performance critical code path... and we need to assess what would happen if job-dsl blows up... does it delete the job or leave it alone?

          Michael Neale added a comment -

          Right, this is tricky. 

          Can I ask if this has been around, with jobDSL, and only recently being noted? A warning/error may be ok for jobDSL, but for blue ocean would want it to not suddenly blow up. 

           

          If a work around for blue users is to open and save the config in classic - I think that is ok, if that is the main impact (assuming it is fixed so NEW jobs get a reasonable non null id). Assuming I understood this. 

          Michael Neale added a comment - Right, this is tricky.  Can I ask if this has been around, with jobDSL, and only recently being noted? A warning/error may be ok for jobDSL, but for blue ocean would want it to not suddenly blow up.    If a work around for blue users is to open and save the config in classic - I think that is ok, if that is the main impact (assuming it is fixed so NEW jobs get a reasonable non null id). Assuming I understood this. 

          Michael Neale added a comment -

          On talking with stephen - the short term more urgent thing is to fix up blue ocean https://issues.jenkins-ci.org/browse/JENKINS-48571 first - get that out. Then can look at this to warn/error out for jobdsl/future users. 

          Michael Neale added a comment - On talking with stephen - the short term more urgent thing is to fix up blue ocean https://issues.jenkins-ci.org/browse/JENKINS-48571  first - get that out. Then can look at this to warn/error out for jobdsl/future users. 

          Michael Neale added a comment - - edited

          abayer Now blue ocean is being fixed - the main thing for this is probably to have an obnoxious SEVERE error that an id isn't set (aimed at jobDSL users). I'll confirm with other users that this was the issue (mostly they seem to be jobDSL users)

          Michael Neale added a comment - - edited abayer Now blue ocean is being fixed - the main thing for this is probably to have an obnoxious SEVERE error that an id isn't set (aimed at jobDSL users). I'll confirm with other users that this was the issue (mostly they seem to be jobDSL users)

          Felipe Santos added a comment - - edited

          I think JENKINS-61469 is caused due to this.

          Felipe Santos added a comment - - edited I think  JENKINS-61469  is caused due to this.

            Unassigned Unassigned
            stephenconnolly Stephen Connolly
            Votes:
            2 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated: