-
Bug
-
Resolution: Unresolved
-
Major
-
Firefox
Chrome 64bits on Windows
Microsoft Edge
Microsoft Internet Explorer
-
Blue Ocean 1.3, Blue Ocean 1.4 - beta 1, Blue Ocean 1.4 - beta 2
Leaving a tab open with the stage view plugin active leaks about 200 MB of memory every 10 minutes. Closing the tab and forcing a GC under 'about:memory' seems to recover the memory.
- is blocked by: JENKINS-60227 Users other than admin but with job configure privileges are unable to configure jobs (Resolved)
- is duplicated by: JENKINS-47009 Memory usage on pipeline activity using browser for long time (Resolved)
- is related to: JENKINS-34210 Memory leak in Jenkins frontend when updating executors (Open)
[JENKINS-41558] Memory leak in Jenkins stage view
I see exactly the same behavior with the 32-bit Windows version of Firefox (release 51.x and beta 52.x). The "fix" at the moment is to use the 64-bit version.
I'm running 64-bit Firefox version 51.0.1 under Linux if it makes any difference.
UPDATE: I tried leaving a tab open in Chrome 56.0.2924.76 (64-bit), and it crashes after consuming more and more memory for a while.
We're also experiencing this issue. We have a Windows 7 PC that continuously displays the Jenkins Stage View as a wallboard/information radiator. This PC is running Chrome (64-bit), and over time the stage view tab becomes slower and slower until it finally crashes. At that point the Chrome tab has accumulated about 1.5 - 2 GB of memory. When the stage view is open and you look in the Windows Task Manager, you'll see the memory increasing every few seconds.
Our workaround: refresh the Chrome tab every few minutes using the SuperAutoRefresh plugin/extension to keep the memory from increasing above roughly 100 MB.
To add to what was said:
This happens with all major browsers: we have computers with Internet Explorer, Firefox and Chrome/Chromium, in 32 and 64 bits, accessing the stage view and the result is the same: an extremely fast leak that keeps eating memory until the machine is no longer responsive. This makes it impossible to keep an eye on the stage view at all times.
So we use the same workaround as Richard Kettelerij, but this makes the stage view essentially useless, as you might be interrupted by a refresh while inspecting the pipeline.
Please review this. I can also confirm this is happening for me, with Firefox on macOS. I can barely see the page, and then have to close it and clean up memory.
What is the status of this bug? I've been monitoring this issue for a while now but nothing seems to change.
I set this to Blocker because it is one. If you can't open the page that shows whether your code is correct for all platforms without fearing that it will eat all your memory, the CI becomes an obstacle and not a friend in the development process. You can enhance the rest all you want; if the UI doesn't work correctly, it's for nothing.
I can also confirm that this is happening in our usage of the plugin. It seems to occur regardless of browser and OS, and it ought to be prioritized higher, at least to Critical.
I see this with Chrome on Linux (using version 2.7 of the pipeline stage view). Will upgrade to 2.9 tomorrow to confirm. Here's what I'm seeing:
- Taking a heap snapshot in developer tools, it looks like the RAM is getting allocated to an attribute named 'cache'
- In the Network tab, repeated calls (every 5s or so) to /job/<jobname>/<build>/wfapi/changesets?_=<timestamp> for every build listed on the view. The timestamp increases each time, so I think each call is different.
- There are other repeated AJAX calls, but I think they're likely to be unrelated...
Sorry for the vagueness of this - I'll try to get something more concrete.
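A note on the `_=<timestamp>` parameter seen above: jQuery appends a parameter of this form to GET requests when `cache: false` is set, so the changing value may only be defeating the browser's HTTP cache rather than indicating genuinely different requests. Whether stage view sets that option is an assumption here. The sketch below just shows how two polls five seconds apart differ only in that parameter; the host, job name and build number are hypothetical.

```javascript
// Build a changesets URL with a cache-busting "_" parameter.
// Host, job name and build number are placeholders for illustration.
function changesetsUrl(base, job, build, now) {
  const path = `/job/${encodeURIComponent(job)}/${build}/wfapi/changesets`;
  const url = new URL(path, base);
  url.searchParams.set('_', String(now)); // only this value changes per poll
  return url.toString();
}

// Two polls, 5 seconds apart (timestamps in milliseconds).
const firstPoll = changesetsUrl('http://jenkins.example', 'example-job', 42, 1000);
const secondPoll = changesetsUrl('http://jenkins.example', 'example-job', 42, 6000);

// Same endpoint, different full URL, so the HTTP cache is bypassed each time.
const sameEndpoint = new URL(firstPoll).pathname === new URL(secondPoll).pathname;
console.log(firstPoll, secondPoll, sameEndpoint);
```

If that is what is happening, the polling itself is expected behaviour and the leak is more likely in what the page does with each response.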
I've reproduced this with the plugin built from commit f5ce1f7. I'm less sure about which AJAX calls are relevant, however.
I haven't observed memory leaks with Blue Ocean - this is just with the Classic pipeline stage view.
timretout Thank you, that investigation may be enough on its own for me to be able to solve this. I was under the impression that the root cause was something nasty and complex with the DOM elements, but if the cause is simply the cache, that's fairly easy to solve.
michaelneale cliffmeyers sorry, to clarify: I've seen this problem with that particular commit (tip of master at time of writing), with 2.7 and in fact going back quite a while, so I'm not sure when the issue was introduced.
svanoort I tried commenting out the insertions into the "cache" variable in ui/src/main/js/model/rest-api.js but this did not work for me, so although I believe the variable is named "cache", my devtools-fu is not strong enough to find out which cache!
EDIT: my new hunch is that it's the jQuery cache, $.cache - I think there are event handlers not getting cleaned up, or something similar.
timretout yes, I expect it is some structural code or DOM stuff not being cleaned up. I find it hard to believe it is just data from the server that takes up so much space, but hopefully cliffmeyers will get a chance to take a look soon. Any other help appreciated.
I think I've identified three places which are creating circular references:
- ui/src/main/js/view/node-log.js - the click() handler is holding a reference to onElement
- ui/src/main/js/view/stage-logs.js - something's holding on to nodeNameBars? I think the onshow function
- ui/src/main/js/view/widgets/popover/index.js - in hover(), I think the mouseenter function references onElement
When I comment out the relevant code in these three places, the memory leaks stop.
I now think I was wrong to describe what's going on as "circular references" - there are event handlers in the jQuery cache referencing otherwise unattached DOM nodes. I think if the event handlers get cleaned up, the DOM nodes can get garbage-collected as normal.
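The retention pattern described here can be illustrated without a browser. This is a minimal simulation, not the plugin's code: `handlerCache` is a hypothetical stand-in for a page-wide handler store such as jQuery's internal event data, and `FakeNode`, `render` and `simulate` are invented for the example. It shows that handlers left in the store keep old nodes reachable, while removing them before each re-render keeps the store bounded.

```javascript
// Hypothetical stand-in for a page-wide handler store.
// Keys are node ids; values are handler closures.
const handlerCache = new Map();
let nextId = 0;

// Minimal fake DOM node; not a real browser element.
class FakeNode {
  constructor(label) {
    this.id = nextId++;
    this.label = label;
  }
}

// Register a click handler whose closure captures the node itself,
// so the store keeps the node reachable for as long as the entry exists.
function onClick(node) {
  handlerCache.set(node.id, () => `clicked ${node.label}`);
}

// Re-render: build fresh nodes. With cleanup enabled, the handlers of the
// previous nodes are dropped first so nothing in the store references them.
function render(previous, count, cleanup) {
  if (cleanup) {
    for (const node of previous) handlerCache.delete(node.id);
  }
  const nodes = [];
  for (let i = 0; i < count; i++) {
    const node = new FakeNode(`stage-${i}`);
    onClick(node);
    nodes.push(node);
  }
  return nodes;
}

// Simulate a number of refreshes of a 5-stage view; return the store size.
function simulate(passes, cleanup) {
  handlerCache.clear();
  let nodes = [];
  for (let p = 0; p < passes; p++) nodes = render(nodes, 5, cleanup);
  return handlerCache.size;
}

const leaked = simulate(100, false); // entries accumulate on every pass
const bounded = simulate(100, true); // only the live nodes' handlers remain
console.log({ leaked, bounded });
```

In real jQuery code the equivalent cleanup would go through jQuery's own removal paths (for example `.off()` or `.remove()`) rather than a hand-rolled map.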
I've got a work-in-progress patch: jenkins-pipeline.diff - this is what I'm applying to the compiled stageview.js on a Jenkins instance that shows the problem.
This stops the leaks, but breaks popover behaviour: when new data comes in every 5s, it will immediately hide any open popup. I think it also breaks the hover() handler so no new popovers are shown. But I'm trying to point the way to the bits of code which are causing the problem!
Thanks timretout - looks like it is getting closer. I assume that once a pipeline run is finished the popover behavior wouldn't be "broken" (as no new data is coming in?). It may be possible to defer that cleanup code so it happens "much later" (a bit of a hack though), but this is great stuff! If this points to the leak and you can see it, that is much farther than anyone has gotten.
Nice one.
timretout this is some great input. Let me have a look at this and see if I can repro what you've found so far, I'm hoping that the fix won't take long after that.
cliffmeyers thanks - I will be happy to build/test any patch you come up with.
Did a little investigation, I think it's fair to say there is definitely a DOM leak and possibly other data being held as a result of that.
My job has 5 stages that each ping localhost for 1m each. I started with about 10 runs on the page and took a heap snapshot. Then repeated these steps ~3 times:
- Ran a build
- Waited for it to finish (~5m)
- Forced GC
- Took another heap snapshot.
Begin
18 MB usage
~500 HTMLDivElements
End
47 MB
~39,000 HTMLDivElements
A quick $$('div') yields roughly 500 elements on the screen so the initial number looks right and it's clear they are being leaked over time.
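As a rough back-of-the-envelope check, the snapshot figures above can be turned into per-run numbers. This is only arithmetic on the reported values; the run count was given as "~3", so the results are approximate.

```javascript
// Figures reported in the heap snapshots above.
const begin = { heapMB: 18, divs: 500 };
const end = { heapMB: 47, divs: 39000 };
const runs = 3; // reported as "~3", so per-run values are estimates

// Elements that exist in the heap but are no longer needed on screen.
const leakedDivs = end.divs - begin.divs;

// Average leak per pipeline run.
const divsPerRun = Math.round(leakedDivs / runs);
const heapPerRunMB = ((end.heapMB - begin.heapMB) / runs).toFixed(1);

// How many times more div elements are retained than are displayed.
const growthFactor = end.divs / begin.divs;

console.log({ leakedDivs, divsPerRun, heapPerRunMB, growthFactor });
```

So each 5-minute run retains on the order of 13,000 extra div elements, many times the roughly 500 that are visible at any moment.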
Pretty sure this is going to be event listener related: I think the code that deregisters the listener may not be targeting the original element, and hence the element leaks. Should have an update tomorrow.
Testing Notes:
- It'd be cool if we could find a way to unit test for removal of the leak. That might not be a possible thing, but cliffmeyers and I just talked about it a moment ago. Worth at least looking into.
- Otherwise, this looks to be a pretty easy one to do a visual verification on.
Just putting this back to "Open" for now as I have a couple of other tickets w/ higher priority to fix first.
Just adding a data point in case it helps:
On Linux/Firefox, I can see a running build page grow from ~40 MB to over 2 GB over the course of an hour or so. The majority of that space, ~1.6 GB, is listed under dom/orphan-nodes.
Jenkins 2.190.1
Firefox 60.9.0esr
Here are some notes from when a few developers looked into this a few months back:
Whenever Stage View receives an event, it re-renders the entire table. When that happens, all of the DOM nodes from the last render pass should become unreachable and be garbage collected, but right now, that does not happen, because there are a bunch of references to those nodes from the page-wide instance of JQuery (definitely via event listeners, but also via other JQuery-internal paths which we did not fully understand), so the old nodes are retained forever, causing the memory leak.
We did not see an obvious or easy way to fix this, but we are not JavaScript experts, so maybe someone with more experience in the area would be able to quickly figure out how to fix it. Maybe we need to modify all event listener registration so that the listeners are stored at the top level and can be cleared out when an event is received before re-rendering the table, maybe the version of JQuery being used needs to be updated, maybe both, etc.
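One of the options floated above, storing listeners at the top level so they can be cleared before each re-render, might look roughly like the sketch below. It is not taken from the plugin: `ListenerRegistry` and `renderTable` are hypothetical names, and the built-in `EventTarget` and `Event` classes stand in for DOM elements so the example runs outside a browser.

```javascript
// Tracks every listener the view registers so that all of them can be
// removed in one call before the table is re-rendered.
class ListenerRegistry {
  constructor() {
    this.entries = [];
  }
  add(target, type, handler) {
    target.addEventListener(type, handler);
    this.entries.push({ target, type, handler });
  }
  clear() {
    for (const { target, type, handler } of this.entries) {
      target.removeEventListener(type, handler);
    }
    this.entries = [];
  }
  get size() {
    return this.entries.length;
  }
}

const registry = new ListenerRegistry();
let clicks = 0;

// Hypothetical render pass; each EventTarget stands in for a table cell.
function renderTable(stageCount) {
  registry.clear(); // release handlers that point at the previous cells
  const cells = [];
  for (let i = 0; i < stageCount; i++) {
    const cell = new EventTarget();
    registry.add(cell, 'click', () => { clicks += 1; });
    cells.push(cell);
  }
  return cells;
}

const oldCells = renderTable(5);
const newCells = renderTable(5);
oldCells[0].dispatchEvent(new Event('click')); // handler was removed: no effect
newCells[0].dispatchEvent(new Event('click')); // live handler fires

const tracked = registry.size;
console.log({ tracked, clicks });
```

The registry size staying equal to the number of live cells after repeated renders is also the kind of property a unit test could assert. Note that this only covers listeners; the other jQuery-internal references mentioned above would need separate investigation.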
We are being affected by this more and more. In fact we are seeing notebook batteries being sucked dry in 45 minutes because of this.
Our stages view spins out of control really quickly, aggregating up to 11 GB RAM in about 20 minutes. The UI also constantly consumes 10-15% of CPU time, which sounds overly high for that tiny bit of visualization; we assume it's related to the browser trying to manage the large amounts of memory.
BlueOcean, in comparison, remains stable.
We have confirmed this issue on all common Windows browsers (Firefox, IE, Edge, and Chrome), so I added these to the issue's header.
From the reports above, I deduce that there is nobody actively working on this. And the issue is now > 3 years old, which is quite a shame... Can we help somehow?
That's interesting that it's so prevalent on Windows browsers. I think many of us have been able to recreate the problem elsewhere, but your description above sounds more severe than what I've experienced myself.
"From the reports above, I deduce that there is nobody actively working on this. And the issue is now > 3 years old, which is quite a shame... Can we help somehow?"
Yes! Yes please. If someone in the community has the willingness to try and tackle this problem, we'll make sure the PR gets reviewed thoughtfully and quickly. There are a lot of potentially useful notes in the previous comments in this ticket.
I don't believe there is much correlation to Windows browsers. Plenty of my colleagues have this issue with Linux based browsers, Chrome and Firefox. Leave a pipeline view up in a browser tab and it grows at a phenomenal rate (hundreds of KB per second).
I would also like to bring developers' attention to this, as we are suffering heavily from the issue.
Reproduced on Ubuntu/Firefox
Closing the Jenkins tab freed about 7 GB. Output of free -h before and after:
$:~/bulk-data/projects/workspace/all/loaders$ free -h
        total   used    free    shared  buff/cache  available
Mem:    15Gi    14Gi    241Mi   207Mi   751Mi       459Mi
Swap:   4.0Gi   3.2Gi   860Mi
~/bulk-data/projects/workspace/all/loaders$ free -h
        total   used    free    shared  buff/cache  available
Mem:    15Gi    7.7Gi   7.3Gi   245Mi   628Mi       7.4Gi
Swap:   4.0Gi   3.3Gi   709Mi
Added reference to possibly related issue.