
NodeMonitor out of disk space message is wrong for master

    • Jenkins 2.164

      When the master runs out of disk space, it is put offline by the hudson.node_monitors.AbstractDiskSpaceMonitor system.

      The message has a blank where the node name should be:

      Putting back online as there is enough disk space again

      I think this is just because c.getName() returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.
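
The blank can be reproduced outside Jenkins. This is a minimal sketch, assuming the log text is built by substituting the node name into a pattern. The pattern string is an assumption modelled on the message quoted above, not copied from the Jenkins resource bundle, and `BlankNameSketch` is a hypothetical class that is not part of Jenkins.

```java
import java.text.MessageFormat;

// Hypothetical demo class for illustration; not part of Jenkins.
class BlankNameSketch {
    // Assumed pattern, modelled on the message quoted in this issue.
    static final String PATTERN =
            "Putting {0} back online as there is enough disk space again";

    // Substitutes the node name into the pattern.
    static String render(String nodeName) {
        return MessageFormat.format(PATTERN, nodeName);
    }

    public static void main(String[] args) {
        // An empty name leaves two adjacent spaces where the name should be.
        System.out.println("[" + render("") + "]");
        // A non-empty name fills the slot as expected.
        System.out.println("[" + render("agent-1") + "]");
    }
}
```

With an empty name the slot collapses to nothing, which would explain the gap in the reported message if c.getName() is indeed empty for the master.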

      Acceptance criteria

      • This should show master instead of a space as the node name, i.e.

      Putting back master online as there is enough disk space

        Attachments:

        1. Capture5 - Copy.JPG (159 kB)
        2. image-2019-02-09-00-30-55-101.png (193 kB)
        3. Capture2 - Copy.JPG (193 kB)
        4. Capture4 - Copy.JPG (127 kB)
        5. Capture3 - Copy.JPG (181 kB)
        6. Capture3.JPG (121 kB)
        7. Capture1.JPG (59 kB)
        8. Capture2.JPG (125 kB)
        9. Capture1.JPG (59 kB)

          [JENKINS-55738] NodeMonitor out of disk space message is wrong for master

          Nisarg Shah added a comment -

          I would like to work on this issue. May I assign it to myself?

          Baptiste Mathus added a comment -

          nisarg14 absolutely, please do. Thanks! If you need help, please reach out to us either on IRC or on the developers mailing list.

          Thanks again!

          Nisarg Shah added a comment -

          Sure. Thank you!!

          Actually I am new to open source and also very much excited to contribute.


          Jesse Glick added a comment -

          I think it would suffice to use Computer.getDisplayName, which is overridden for MasterComputer.


          Nisarg Shah added a comment -

          I have a few questions. Which Gitter room should I join to discuss them, or should I ask here?

          Jesse Glick added a comment -

          https://gitter.im/jenkinsci/jenkins is always available for live chat, though I generally feel that if you have a specific question about a specific JIRA issue, commenting in that issue is best unless you think it is a broader topic that many people might want to participate in.


          Nisarg Shah added a comment -

          Okay. Thanks

          I will go through the relevant parts of the code and post comments here if needed.

          Nisarg Shah added a comment -

          Would it be better, and would it work, to pass "Putting back online as there is enough disk space again" directly to Logger.info()? Changing the getName() method of the Computer class would cause problems, because that method is used in many places.

          Jesse Glick added a comment -

          I suspect replacing the two occurrences of c.getName() in the abovementioned code section with c.getDisplayName() would fix this. Most of the work would be figuring out how to verify the fix.
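
A minimal sketch of why this suggestion should work, using simplified stand-in classes rather than the real hudson.model.Computer hierarchy. It assumes, as described in this thread, that getName() returns an empty string for the master while MasterComputer overrides getDisplayName(). The class bodies, the "master" label, and the message text are illustrative assumptions, and `DisplayNameSketch` is hypothetical.

```java
// Simplified stand-ins for illustration; these are not the real Jenkins classes.
class DisplayNameSketch {

    static class Computer {
        private final String nodeName;

        Computer(String nodeName) {
            this.nodeName = nodeName;
        }

        String getName() {
            return nodeName;
        }

        // By default the display name is just the node name.
        String getDisplayName() {
            return nodeName;
        }
    }

    static class MasterComputer extends Computer {
        MasterComputer() {
            super(""); // assumption: the master has an empty node name
        }

        @Override
        String getDisplayName() {
            return "master"; // assumption: a fixed label for the master
        }
    }

    // Message built from getName(), as in the reported behaviour.
    static String before(Computer c) {
        return "Putting " + c.getName() + " back online as there is enough disk space again";
    }

    // Message built from getDisplayName(), as suggested in this comment.
    static String after(Computer c) {
        return "Putting " + c.getDisplayName() + " back online as there is enough disk space again";
    }

    public static void main(String[] args) {
        Computer master = new MasterComputer();
        Computer agent = new Computer("agent-1");
        System.out.println(before(master));
        System.out.println(after(master));
        // For an ordinary node both variants give the same text in this sketch.
        System.out.println(before(agent).equals(after(agent)));
    }
}
```

In this sketch only the master's message changes; nodes that do not override getDisplayName() keep their existing text.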


          Nisarg Shah added a comment -

          Yes, it may. But I looked at every line where c.getName() is called, and I think editing that method would cause problems. So either we call another display method or we pass the string directly. Also, c.getDisplayName() is overridden to return nodeName, which is exactly what c.getName() does, so I am not sure it will work. There are two implementations of getDisplayName, one of which is overridden to return nodeName.

          Jesse Glick added a comment -

          The reported issue is about MasterComputer, which overrides getDisplayName as previously noted.


          Nisarg Shah added a comment -

          Yes, the issue concerns MasterComputer. But would replacing c.getName() with c.getDisplayName() fix it?

          Jesse Glick added a comment -

          Maybe you should try it and find out.


          Nisarg Shah added a comment -

          https://github.com/jenkinsci/jenkins/pull/3863

          In this pull request, 4 existing tests failed. Can you please help me figure out what happened? No code in those modules was changed.

          Nisarg Shah added a comment -

          Can anyone tell me how I can run the tests on my local computer?

          Jesse Glick added a comment -
          mvn -DskipTests clean install && mvn -f test surefire:test -Dtest=WhateverTest\#method
          

          if there is a specific test you are interested in, though you can also just use CI (failures look unrelated to your code). In this case I doubt there is any relevant automated test coverage—the issue would need to be verified manually by actually simulating an out of disk space event.


          Nisarg Shah added a comment -

          Okay Thanks...

          I think here the behaviour of changed code is not okay and the tests failed are unrelated to my code. So now how should I approach to solve this?


          Nisarg Shah added a comment -

          What about passing a string message directly to Logger.info()? I think this change would leave the behaviour of the code unchanged and still solve the problem.

          Nisarg Shah added a comment - - edited

          In my pull request yesterday testing was being done and it passed all the tests. And a green tick was given that all tests cases passed and this commit can be build. But today tests were running once again and one error occurred which is different from the errors occurred in earlier test. Error is of Remote call on jenkinsinfra-highmem8ca11 failed


          Nisarg Shah added a comment -

          I just need to verify the changes done by doing tests as I had a look on code once again and thought that the changes done in the pr looks fine just needed to get verify. Are there any specific tests needed for this through which I can get to know that whether it works or not?

           


          Jesse Glick added a comment -

          here the behaviour of changed code is not okay

          Why do you say that?

          Error is of Remote call on jenkinsinfra-highmem8ca11 failed

          Likely a problem in CI infrastructure, not the code being tested.

          Are there any specific tests needed for this through which I can get to know that whether it works or not?

          As previously mentioned, I am not aware of any. The issue relates solely to the user interface and test coverage of the UI is low.


          Nisarg Shah added a comment -

          Okay. So I think I should check where the value of getDisplayName() is passed into this message. That should tell us whether the change will work.

          Nisarg Shah added a comment -

          The tests that failed earlier now pass. My only remaining problem is verifying the fix by manually creating an out-of-disk-space event.

          Baptiste Mathus added a comment - - edited

          nisarg14 are you on Windows or Linux?

          On Linux, the usual way for this is simply to use fallocate or dd, see https://stackoverflow.com/a/5688625/345845.

          On Windows, the simplest low-tech way I did that in the past is the following:

          • find a big enough file, or very big if you have a lot of free disk space
          • copy paste it as many times as needed, even using the UI and Ctrl-C Ctrl-V should be quick enough
            • i.e. if you have 10 GB of free disk space, find a big file, like 1GB, and copy-paste it like nine or ten times

          Or maybe try https://blogs.msdn.microsoft.com/oldnewthing/20150710-00/?p=45171, though I haven't tried it myself.

          HTH
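
For a repeatable setup on either platform, much the same effect can be had with a small throwaway program. This is a minimal sketch, assuming that writing real zero bytes is enough to consume space; on filesystems with transparent compression it may not be. `DiskFiller` and its arguments are hypothetical and not part of Jenkins.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for manual testing; not part of Jenkins.
class DiskFiller {

    // Writes `bytes` zero bytes to `target` (overwriting it) and returns the
    // resulting file size. I/O failures, including a full disk, surface as
    // UncheckedIOException.
    static long fill(Path target, long bytes) {
        byte[] chunk = new byte[1 << 20]; // 1 MiB buffer of zeros
        try (OutputStream out = Files.newOutputStream(target)) {
            long remaining = bytes;
            while (remaining > 0) {
                int n = (int) Math.min(chunk.length, remaining);
                out.write(chunk, 0, n);
                remaining -= n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            return Files.size(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: DiskFiller <file> <bytes>");
            return;
        }
        Path target = Paths.get(args[0]);
        long written = fill(target, Long.parseLong(args[1]));
        System.out.println("Wrote " + written + " bytes to " + target);
    }
}
```

Deleting the generated file afterwards frees the space again, which is the moment the "back online" message is expected to appear.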


          Nisarg Shah added a comment -

          batmat Okay, great. I had also thought of copying and pasting big files to manually create an out-of-disk-space event. But after that, how do I run the Jenkins code locally and test the fixed part?

          Nisarg Shah added a comment - - edited

          batmat jglick

          Due to some issues I deleted my previous repo, and as a result my old PR was deleted.

          Here is the new PR: https://github.com/jenkinsci/jenkins/pull/3874

          Nisarg Shah added a comment -

          batmat

          Can you suggest the test command (run from the Jenkins folder) that I can use to check this fix?

          I have traced some of the code flow and want to check whether the fix actually works.

          Baptiste Mathus added a comment -

          Answered in the open GitHub Pull Request.

          Nisarg Shah added a comment -

          batmat

          I have tested the fix and am attaching screenshots of the output in the Jenkins window and the CLI.

          I think this fix has resolved the issue of displaying a null message.

          Baptiste Mathus added a comment -

          nisarg14 the goal here is to fix the message explained in the description.

          See the acceptance criteria above.
          You need to see how Jenkins reacts when you remove the big files and have enough disk space again.

          Nisarg Shah added a comment -

          batmat

          I am attaching a screenshot of the message before the fix.

          I have marked the issue in red.

          Nisarg Shah added a comment -

          batmat

          I am attaching a screenshot of the message after the fix.

          I think the issue is solved as per the acceptance criteria.

          Oleg Nenashev added a comment -

          The fix has been released in Jenkins 2.164. Thanks nisarg14!


            nisarg14 Nisarg Shah
            batmat Baptiste Mathus