Stage graph unsuitable for large and/or complex pipelines

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: blueocean-plugin
    • Labels:
      None
    • Environment:
      Jenkins 2.40
      Blue Ocean 1.0.0-b17

      Description

      Improvement on roadmap

      This improvement is on the Blue Ocean project roadmap. Check the roadmap page for updates.

      The Blue Ocean stage graph is great for small, simple pipelines; however, it breaks down with many parallel builds. See attached screenshot for an example.

      Because 'stage' can no longer be nested within 'parallel', all of our steps must belong under a single 'Test' stage. We have 19 parallel jobs, which is not an uncommon number for iOS/Android development where many combinations of app, device and OS version need to be tested. We'd actually like to split some of the jobs into smaller chunks to take advantage of idle build agents, but this would greatly exacerbate the problem.

      Grouping jobs under multiple stages would improve the UI experience, but also drastically increase the runtime of our integration runs as stages are executed serially.

      I envision two possible solutions:

      1. Stages have a 'parallel' option that allows them to run at the same time as other parallel stages.
      2. A step is introduced that is used purely as an annotation for the purposes of rendering a more appropriate graph. Ideally the step would be deeply nestable allowing for complex graph hierarchies.

      Thanks for all the hard work on Blue Ocean, it's really shaping up nicely and I eagerly await each new release.

            Activity

            jamesdumay James Dumay added a comment -

            Thanks Ian Leitch! For that second point, would you be interested in something like JENKINS-38442?

            ileitch Ian Leitch added a comment -

            James Dumay It's not clear in JENKINS-38442 how the graph would be rendered. Nested stages would solve the issue for me, provided they do alter the graph. Ideally, though, having both nested stages and parallel stages would provide the most flexibility.

            jamesdumay James Dumay added a comment -

            Ian Leitch we are not even clear how things should be rendered, because users' ideas of how that should look are so diverse.

            I think there are some things we can do here to make the existing graph more acceptable in your case. We will also be looking at how to solve the nesting problem this year but I can't give a good ETA on it yet.

            ileitch Ian Leitch added a comment -

            The sketch you provided: https://issues.jenkins-ci.org/secure/attachment/34100/blueocean.sketch%202016-09-28%2015-03-57.png
            would not be ideal as it only alters the steps table, not the graph. If we applied this approach to our pipeline, the graph would only display a single "Test" stage, with all of the meaningful stages being shown in the list below. This would make the graph linear, and somewhat redundant.

            jamesdumay James Dumay added a comment -

            Ian Leitch yeah, I think that design is totally out, both for the reason you provided and because two steps can appear on the screen at once, which makes the running behaviour really complex.

            jamesdumay James Dumay added a comment -

            Ian Leitch actually would you be up for a Google Hangout in the next few weeks? We can discuss this in greater detail. Would love to see some of the Pipelines you have built and how they show up in Blue Ocean.

            ileitch Ian Leitch added a comment -

            James Dumay Please see my proposal here: https://issues.jenkins-ci.org/browse/JENKINS-38442?focusedCommentId=284175&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-284175

            ileitch Ian Leitch added a comment -

            James Dumay Sorry just noticed the Google Hangout question. Yeah I'm happy to do so, though be warned that I'm very new to Jenkins (I was tasked with switching away from TeamCity). We only have a single Pipeline at the moment.

            jamesdumay James Dumay added a comment -

            Oh no that's perfect that you are new to things! Can you email me on your work email? I'm jdumay@cloudbees.com

            evildeece Alastair D'Silva added a comment - - edited

            A multiaxis way of displaying steps would be nice too, consider the following:

            // Axes of the build matrix.
            def hostPlatforms = ['ppc64le', 'x86_64']
            def targetPlatforms = ['x86_64', 'power32', 'ppc64le', 'ppc64be', 'arm']
            def hostOSs = ['ubuntu-16.04']

            // Map of branch name -> closure, as expected by the parallel step.
            def builds = [:]

            for (hostPlatform in hostPlatforms) {
                for (targetPlatform in targetPlatforms) {
                    // Skip this host/target combination.
                    if (hostPlatform == 'ppc64le' && targetPlatform == 'x86_64') {
                        continue
                    }

                    for (hostOS in hostOSs) {
                        def name = 'build-' + hostPlatform + '-' + targetPlatform
                        // Copy the loop variables so each closure captures its own values.
                        def myHostPlatform = hostPlatform
                        def myHostOS = hostOS
                        def myTargetPlatform = targetPlatform

                        builds[name] = { buildToolchain(myHostPlatform, myHostOS, myTargetPlatform) }
                    }
                }
            }

            // Run every branch in parallel inside a single stage.
            stage('build') {
                parallel builds
            }
            

            This currently ends up with a long list scrolling down the page, but would be better represented by a display similar to the Matrix Project plugin.
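
To make the matrix idea a little more concrete, the flat list of parallel branch names could be pivoted into a host-by-target grid before rendering. The sketch below is only an illustration in JavaScript (the language of the Blue Ocean UI). `toMatrix` is a hypothetical helper, not part of Blue Ocean, and it assumes the `build-<host>-<target>` naming from the script above, with no hyphens inside the platform names.

```javascript
// Hypothetical helper: pivot flat branch names like "build-ppc64le-arm"
// into { host: { target: branchName } } so a UI could render a grid.
function toMatrix(branchNames) {
  const matrix = {};
  for (const name of branchNames) {
    // Assumes exactly "build-<host>-<target>", with no "-" inside either part.
    const parts = name.split('-');
    if (parts.length !== 3 || parts[0] !== 'build') {
      continue; // skip names that do not follow the assumed convention
    }
    const [, host, target] = parts;
    if (!matrix[host]) {
      matrix[host] = {};
    }
    matrix[host][target] = name;
  }
  return matrix;
}

const grid = toMatrix([
  'build-ppc64le-arm',
  'build-x86_64-arm',
  'build-x86_64-x86_64',
]);
console.log(Object.keys(grid)); // hosts become rows
console.log(Object.keys(grid.x86_64)); // targets become columns for that row
```

Combinations that were skipped in the pipeline script simply have no entry in the grid, so a renderer could show them as empty cells.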

            scgruber Sam Gruber added a comment - - edited

            This pipeline of ours is unworkable in the current view (1.0.0rc3) because we cannot click through to any information on our later stages (we have approximately 250 stages total, and this view only shows the first 90). Even if the full graph is unrenderable, it would be nice to have some way of loading these stages. Just a dropdown list would be preferable to the current state.

            jamesdumay James Dumay added a comment -

            Sam Gruber great description of your use case here. I have a few questions:

            • What actions are you performing with the parallels?
            • Why 250 stages? What meaning do they have?
            scgruber Sam Gruber added a comment -

            James Dumay this pipeline is an integration test against our different hardware platforms. We have a sequence of tests (asdf3, asdf4, ...) which run in parallel over our hardware platforms (zxcv1, zxcv2, zxcv3, ...). So we have a stage for each combination of test and hardware, and that's how we get to 250.

            jamesdumay James Dumay added a comment -

            Sam Gruber very interesting! Thanks for the additional context. When we plan this feature, could we contact you for a Google Hangout to discuss the new design?

            scgruber Sam Gruber added a comment -

            James Dumay sure, as long as it can be scheduled during US Pacific (UTC-7) business hours.

            clementgautier Clement Gautier added a comment -

            Any insight on this issue? I'm in the same situation with even more parallel steps (over 400 and growing). Basically we have so many browser tests that we need to split them a lot. Previously we were using a distributed system of our own, with a queue and consumers that upload the results back to the Jenkins master server, but it kinda sucks and we really want to move back to the basic master / slave management of Jenkins.

            How can I help? From my point of view, a way to tell parallel not to create a visual step for each of its branches would be enough as a quick fix. Do you have a Slack or some other tool to talk directly? I'm not used to Java, so I might need some help with the contribution process.

            andrewconti Andrew Conti added a comment -

            I'm in a similar situation. I've got 25 separate build combinations, some of which are virtualized and have one or more packaging nodes as well. Each artifact produced then gets a test node, and all told there's over 40 artifacts to test. All told I've got about 100 nodes so I'm guessing that must be the limit. Just my 2¢ that the limit is too low.

            philster_jenkins Phil Clay added a comment -

            +1 on the limit of 100 being too low.  I have some pipelines that are slightly larger than 100, and this is a real pain.  My pipelines only have a few (non-nested) stages, but have lots of parallel steps per stage.  I'm fine with dealing with a slightly long UI graph if that means I get to see all the nodes.

             

            Is there any chance that a short-term solution could be put in place to allow the limit to be configurable, with no changes to the current UI graph display?  I have a feeling that refactoring the UI using some of the solutions proposed here will take quite a while, especially since the roadmap indicates it is "Not planned."

             

            I looked into it a bit.  

            The current UI makes one call to:

            http://jenkins/blue/rest/organizations/jenkins/pipelines/_job_/runs/_runId_/nodes/
            

            Internally, the nodes are retrieved via the PipelineNodeContainerImpl, which is a BluePipelineNodeContainer, which is a Container, which is a Pageable.

            Therefore, the response that is constructed is a PagedResponse, which has a default limit of 100.  Since no start/limit is passed in the request, the default start=0 and limit=100 are used.  So even though the container has all of the nodes in it, only the first 100 are returned in the HTTP response.
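
The effect of those defaults can be sketched in a few lines. This is not the Blue Ocean server code (which is Java); `pageOf` is a hypothetical JavaScript stand-in that only mirrors the start/limit behaviour described above, with the defaults of 0 and 100 taken from this comment.

```javascript
// Hypothetical stand-in for server-side paging: return at most `limit`
// items beginning at index `start`. Defaults mirror the ones described above.
function pageOf(items, start = 0, limit = 100) {
  return items.slice(start, start + limit);
}

// A pipeline with 250 nodes, requested without start/limit:
const allNodes = Array.from({ length: 250 }, (_, i) => ({ id: String(i + 1) }));
const firstPage = pageOf(allNodes);
console.log(firstPage.length); // 100: the remaining 150 nodes are never sent
console.log(pageOf(allNodes, 200).length); // 50: the last, partial page
```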

            The response also includes a Link header for the next page:

            Link: </blue/rest/organizations/jenkins/pipelines/_job_/runs/_runId_/nodes/?start=100&limit=100>; rel="next"
            

            However, the UI does not follow the link to retrieve the next page.
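
For illustration, pulling the next-page URL out of such a header could look roughly like this. `nextLink` is a hypothetical helper, not existing Blue Ocean code, and the parsing is deliberately simplified: it assumes entries are separated by commas, that URLs contain no commas, and that `rel` carries a single value.

```javascript
// Hypothetical helper: extract the rel="next" target from a Link header value
// such as '</path/?start=100&limit=100>; rel="next"'. Returns null if absent.
function nextLink(linkHeader) {
  if (!linkHeader) {
    return null;
  }
  // A Link header may carry several comma-separated entries.
  for (const entry of linkHeader.split(',')) {
    // Split each entry into the <url> part and its trailing parameters.
    const match = entry.match(/^\s*<([^>]*)>(.*)$/);
    if (match && /;\s*rel="?next"?\s*(;|$)/.test(match[2])) {
      return match[1];
    }
  }
  return null;
}

const header =
  '</blue/rest/organizations/jenkins/pipelines/_job_/runs/_runId_/nodes/?start=100&limit=100>; rel="next"';
console.log(nextLink(header));
```

A `null` result would mean there is no further page to request.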

             

            While debugging, I was able to set a breakpoint on the server side, and hack around the arbitrary 100 limit to make the server return more than 100.  The UI worked just fine with the number of nodes in my pipelines.

             

            There are a few hacky ways to make all the nodes show on the current UI:

            1. Increase the default limit constant
            2. Provide a way to configure the default limit (and don't rely on a constant), therefore allowing users to modify the limit as they desire
            3. Have the call to .../nodes pass a higher limit by default... e.g. .../nodes/?limit=500
            4. Have the UI follow links to next pages (could be done by default, OR done if the user clicks on the "Unable to display more" node).

             

            In any case, a short term solution would be much appreciated.

             

             

            benlangfeld Ben Langfeld added a comment -

            My team has the necessary skills to contribute a fix to the UI to fetch all available pages. If we were to submit such a PR, is there a reasonable chance it would be merged for inclusion in the next BlueOcean release, or is there more roadmap politics to navigate than that?

            For anyone who is curious, the deficient object is the PipelinePager, which doesn't actually page in data past the initial page: https://github.com/jenkinsci/blueocean-plugin/blob/master/blueocean-dashboard/src/main/js/components/karaoke/services/pagers/PipelinePager.js

            garettarrowood Garett Arrowood added a comment -

            This bug is greatly hindering my work. Could it be moved up the roadmap, or could someone respond to the gentleman that offered to fix it in the above comment?

            kzantow Keith Zantow added a comment -

            Ben Langfeld if you submit a PR that fixes the issue it would absolutely be considered for inclusion; we'd just have a look and make sure tests pass, etc. Submissions are always welcome!

            cliffmeyers Cliff Meyers added a comment - - edited

            Seconded. Ben Langfeld I had looked at this problem in the past. Another option is to do successive fetches until all nodes / stages are loaded. If you look at the REST responses, you'll see there is pagination data written into a "Link" response header IIRC. That's a way to determine whether there is additional data to be fetched, and you could write some logic to grab say n=100 and just perform successive fetches until the Link header indicates there is no more data.

            We may want to be careful about doing a massive fetch up front (say n=500), as for complex pipelines this might have a perf impact server wide. I recall discussing this with Vivek Pandey a while back; can you refresh my memory on whether it might be preferable to do a single large fetch (say n=500) or several smaller fetches (n=100) until all data is loaded? Intuitively, fewer large fetches seem more efficient from the client's perspective, but I seem to recall a concern with loading a large number of nodes concurrently in the context of a single request?
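
The successive-fetch approach could be sketched roughly as below. This is a minimal illustration, not the actual PipelinePager change: `fetchAllNodes`, `fetchPage`, `fakeFetchPage` and the `maxPages` guard are all hypothetical names, and `fetchPage(url)` is assumed to resolve to the page's nodes plus the next URL taken from the Link header (or `null` when there is none). The page size stays a property of the URL, so either n=100 or n=500 could be tried without changing the loop.

```javascript
// Hypothetical pager: keep requesting pages until no "next" URL is returned.
// `fetchPage(url)` is an assumed callback resolving to { nodes, next }, where
// `next` is the URL from the Link header or null when there is no more data.
async function fetchAllNodes(firstUrl, fetchPage, maxPages = 50) {
  const all = [];
  let url = firstUrl;
  let pages = 0;
  // maxPages is a safety cap so a misbehaving server cannot loop forever.
  while (url !== null && pages < maxPages) {
    const { nodes, next } = await fetchPage(url);
    all.push(...nodes);
    url = next;
    pages += 1;
  }
  return all;
}

// In-memory stand-in for the REST endpoint, serving 250 fake nodes.
const fakeNodes = Array.from({ length: 250 }, (_, i) => ({ id: String(i + 1) }));
async function fakeFetchPage(url) {
  const query = new URLSearchParams(url.split('?')[1] || '');
  const start = Number(query.get('start') || 0);
  const limit = Number(query.get('limit') || 100);
  const end = start + limit;
  return {
    nodes: fakeNodes.slice(start, end),
    next: end < fakeNodes.length ? `/nodes/?start=${end}&limit=${limit}` : null,
  };
}

fetchAllNodes('/nodes/?limit=100', fakeFetchPage).then((nodes) => {
  console.log(nodes.length); // 250, gathered over three requests
});
```

Because the requests are issued one after another, the server only ever builds one page at a time, which is the trade-off raised above against a single large fetch.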

            benlangfeld Ben Langfeld added a comment -

            Following Link is precisely what I had in mind Cliff Meyers. We'll prep a patch. Thanks everyone.

            michaelneale Michael Neale added a comment -

            Ben Langfeld just make sure you are nice and up to date with master as some recent changes were merged for how it follows along (may not affect you, just FYI). As for fetching the pages - absolutely why not, if you have a PR that would be wonderful. Go for it!

            benlangfeld Ben Langfeld added a comment -

            A patch to resolve this is proposed at https://github.com/jenkinsci/blueocean-plugin/pull/1517. I would appreciate a review, particularly from Cliff Meyers.


              People

              Assignee:
              benlangfeld Ben Langfeld
              Reporter:
              ileitch Ian Leitch
              Votes:
              19
              Watchers:
              36

                Dates

                Created:
                Updated:
                Resolved: