Bug
Resolution: Fixed
Trivial
Jenkins 2.164
When the master runs out of disk space, it is put offline by the hudson.node_monitors.AbstractDiskSpaceMonitor system.
The message has a blank where the node name should be:
Putting back online as there is enough disk space again
I think this is just because c.getName() returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.
Acceptance criteria
- This should show master instead of a space as the node name, i.e.
Putting master back online as there is enough disk space again
- Capture5 - Copy.JPG (159 kB)
- Capture2 - Copy.JPG (193 kB)
- Capture4 - Copy.JPG (127 kB)
- Capture3 - Copy.JPG (181 kB)
- Capture3.JPG (121 kB)
- Capture1.JPG (59 kB)
- Capture2.JPG (125 kB)
- Capture1.JPG (59 kB)
[JENKINS-55738] NodeMonitor out of disk space message is wrong for master
nisarg14 absolutely, please do. Thanks! If you need help, please reach out to us either on IRC or on the developers mailing list.
Thanks again!
Sure. Thank you!!
Actually I am new to open source and also very much excited to contribute.
I think it would suffice to use Computer.getDisplayName, which is overridden for MasterComputer.
I have a few questions. Which Gitter room should I join to discuss them? Or should I ask and discuss here?
https://gitter.im/jenkinsci/jenkins is always available for live chat, though I generally feel that if you have a specific question about a specific JIRA issue, commenting in that issue is best unless you think it is a broader topic that many people might want to participate in.
Okay. Thanks
I will go through the relevant parts of the code and post comments here if needed.
Would it be better, and would it work, to pass "Putting back online as there is enough disk space again" directly to Logger.info()? If we change the getName() method of the Computer class instead, it would create a problem, because that method is used in many places.
I suspect replacing the two occurrences of c.getName() in the abovementioned code section with c.getDisplayName() would fix this. Most of the work would be figuring out how to verify the fix.
Yes, it may. But I looked at every place where c.getName() is called, and I think editing that method would cause problems. So either we call another display method or we pass the string directly as it is. Also, getDisplayName() in Computer returns nodeName, which is exactly what c.getName() does, so I am not sure it would work. There are two implementations of getDisplayName, and one of them returns nodeName.
The reported issue is about MasterComputer, which overrides getDisplayName as previously noted.
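To make the difference concrete, here is a minimal sketch using simplified stand-ins for the classes discussed above. It is not the real Jenkins code: the class bodies, the `MARKED_ONLINE` template (reconstructed from the message quoted in the description), and the helper names `before` and `after` are assumptions for illustration only. It relies on the two points made in this thread: the master's name is an empty string, and MasterComputer overrides getDisplayName.

```java
import java.text.MessageFormat;

public class NodeNameSketch {
    /** Simplified stand-in for hudson.model.Computer (hypothetical, not the real class). */
    public static class Computer {
        private final String nodeName;

        public Computer(String nodeName) {
            this.nodeName = nodeName;
        }

        public String getName() {
            return nodeName != null ? nodeName : "";
        }

        // For ordinary nodes the display name is just the node name.
        public String getDisplayName() {
            return nodeName;
        }
    }

    /** Stand-in for the master's computer: empty name, overridden display name. */
    public static class MasterComputer extends Computer {
        public MasterComputer() {
            super("");
        }

        @Override
        public String getDisplayName() {
            return "master";
        }
    }

    // Assumed message template, based on the text quoted in the issue description.
    static final String MARKED_ONLINE =
            "Putting {0} back online as there is enough disk space again";

    /** Message built the way the issue describes the current behaviour. */
    public static String before(Computer c) {
        return MessageFormat.format(MARKED_ONLINE, c.getName());
    }

    /** Message built with the proposed replacement call. */
    public static String after(Computer c) {
        return MessageFormat.format(MARKED_ONLINE, c.getDisplayName());
    }

    public static void main(String[] args) {
        Computer master = new MasterComputer();
        Computer agent = new Computer("agent-1");
        System.out.println(before(master)); // blank where the node name should be
        System.out.println(after(master));  // node name filled in
        System.out.println(after(agent));   // agents are unaffected by the swap
    }
}
```

In this sketch only the call site changes, so other callers of getName() keep their current behaviour.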
Yes, the issue is about MasterComputer. But if we replace c.getName() with c.getDisplayName(), would that fix this issue?
https://github.com/jenkinsci/jenkins/pull/3863
In this pull request, 4 existing tests failed. Can you please help me figure out what happened? No code in those modules was changed.
mvn -DskipTests clean install && mvn -f test surefire:test -Dtest=WhateverTest\#method
if there is a specific test you are interested in, though you can also just use CI (failures look unrelated to your code). In this case I doubt there is any relevant automated test coverage—the issue would need to be verified manually by actually simulating an out of disk space event.
Okay Thanks...
I think here the behaviour of changed code is not okay and the failed tests are unrelated to my code. So how should I approach solving this?
What about passing a string message directly into Logger.info()? I think this would not change any behaviour of the code and would still solve the problem.
Yesterday the tests ran on my pull request and all of them passed; it got a green tick saying that all test cases passed and the commit can be built. But today the tests ran once again and one error occurred, different from the errors in the earlier run. Error is of Remote call on jenkinsinfra-highmem8ca11 failed
I just need to verify the changes with tests. I had another look at the code and the changes in the PR look fine; they only need to be verified. Are there any specific tests needed for this through which I can get to know that whether it works or not?
here the behaviour of changed code is not okay
Why do you say that?
Error is of Remote call on jenkinsinfra-highmem8ca11 failed
Likely a problem in CI infrastructure, not the code being tested.
Are there any specific tests needed for this through which I can get to know that whether it works or not?
As previously mentioned, I am not aware of any. The issue relates solely to the user interface and test coverage of the UI is low.
Okay. So I think I should check where the value of getDisplayName() ends up in this particular message. From that we can be sure whether this change will work or not.
The tests that failed earlier pass now; my only remaining problem is verifying the fix by manually creating an out of disk space event.
nisarg14 are you on Windows or Linux?
On Linux, the usual way for this is simply to use fallocate or dd, see https://stackoverflow.com/a/5688625/345845.
On Windows, the simplest low-tech way I did that in the past is the following:
- find a big enough file, or very big if you have a lot of free disk space
- copy paste it as many times as needed, even using the UI and Ctrl-C Ctrl-V should be quick enough
- i.e. if you have 10 GB of free disk space, find a big file, like 1GB, and copy-paste it like nine or ten times
Or maybe try https://blogs.msdn.microsoft.com/oldnewthing/20150710-00/?p=45171, though I haven't tried it myself.
HTH
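A platform-neutral alternative to the approaches above is a small throwaway program that writes a filler file and deletes it afterwards. This is only a rough sketch: the class name `DiskFillSketch`, the `fill` helper, and the default size are made up for illustration and are not part of Jenkins. It writes real zero bytes rather than just setting a file length, since a sparse file might not actually consume space; on filesystems with transparent compression the zeros may still take up less room than requested.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DiskFillSketch {
    /** Creates a filler file of the given size in dir and returns its path. */
    public static Path fill(Path dir, long bytes) throws IOException {
        Path filler = Files.createTempFile(dir, "disk-filler-", ".bin");
        byte[] chunk = new byte[1 << 20]; // write in 1 MiB chunks
        try (OutputStream out = Files.newOutputStream(filler)) {
            long remaining = bytes;
            while (remaining > 0) {
                int n = (int) Math.min(chunk.length, remaining);
                out.write(chunk, 0, n);
                remaining -= n;
            }
        }
        return filler;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DiskFillSketch <directory> <bytes>
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        long bytes = args.length > 1 ? Long.parseLong(args[1]) : 10L * 1024 * 1024;

        System.out.println("Usable before fill:   " + dir.toFile().getUsableSpace());
        Path filler = fill(dir, bytes);
        System.out.println("Usable after fill:    " + dir.toFile().getUsableSpace());

        // Deleting the filler frees the space again, which is the moment
        // the "back online" message from the description should appear.
        Files.delete(filler);
        System.out.println("Usable after cleanup: " + dir.toFile().getUsableSpace());
    }
}
```

To trigger the monitor, the size would need to be chosen so that the free space on the disk holding the Jenkins home drops below the configured threshold; the small default here is only meant for trying the program out safely.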
batmat Okay, great. I had thought of copying and pasting big files to manually create an out of disk space event. But I still have a question: how do I run the Jenkins code locally and test the fixed part?
Due to some issues I deleted my previous repo, and because of that my old PR got deleted.
Here is new PR : https://github.com/jenkinsci/jenkins/pull/3874
Can you suggest the test command (run from the Jenkins folder) that I can use to check this fix?
I have checked some flow of the code and want to check whether this fix would actually work or not.
I have tested the fix and am attaching screenshots of the output in the Jenkins window and the CLI.
And I think this fix has resolved the issue of displaying null message.
nisarg14 the goal here is to fix the message explained in the description.
See the acceptance criteria above.
You need to see how Jenkins reacts when you remove the big files and have enough disk space again.
I am attaching the screenshot of the message before the fix:
I have marked the issue with a red marker.
I am attaching the screenshot of the message after the fix.
And I think the issue is solved as per acceptance criteria.
I would like to work on this issue. Can I assign it to myself?