Jenkins / JENKINS-35821

API to return per node averages, and ETA based on history of execution

Details

    • 1.0-m7

    Description

      In scope

          Activity

            michaelneale Michael Neale added a comment -

            vivek, is this really redundant now that the real work is to move to the Bismuth API? If so, feel free to close it.

            vivek Vivek Pandey added a comment -

            michaelneale, the Bismuth API gives a basic structure for parsing executed flow nodes in a SAX-like parser; it also gives structure for timing info. Let's keep this ticket open, as we need to implement this functionality in the Blue Ocean API.

            jamesdumay James Dumay added a comment -

            svanoort, is there an API to do this yet?

            jamesdumay James Dumay added a comment -

            As discussed:

            • There is a low-level API in Bismuth to retrieve this data.
            • We will use the low-level API in Blue Ocean to drive out the requirements of a future higher-level API.
            • Think of it as a PoC that lives within Blue Ocean that we could either throw away or reuse.
            • vivek to meet with svanoort to coordinate how it should be done.
            svanoort Sam Van Oort added a comment - - edited

            Per the meeting with @vivek on Friday, there are a couple of parts to this, and breaking it up into smaller pieces makes it easier. We have a follow-up meeting planned in a couple of weeks to sync again and hammer out more of the fine details.

            Pieces:

            1. Collect a bit more structural information in BO during flow analysis (an additional Map or two). This lets us lay a tree-like, or DOM-like, overlay over the flow node data (depth-limited for BO though, so simplified).
            2. Create a generic API accepting a DOM-like interface, which generates mappings of similar ("homologous") parts of the flow execution across WorkflowRuns (e.g. stage 'bob' in Run #1 maps to stage 'bob' in Run #2).
            • This combines a mapping component (the first stage with the same name matches) with a filtering component (e.g. if a stage didn't complete with SUCCESSFUL, it can't be used for prediction).
            • This will support pluggable strategies (heuristics) for the mapping. This lets us do the simplest version, then rip it out entirely if needed (and give Blue Ocean a basic form without much nesting, while analytics can do a fancier one if desired). It is also important because mappings are very fiddly (see: tons of Stage View bugs logged for this aspect).
            • Mappings will be fuzzy/best-effort, and can fail entirely if two runs are too different.
            • For more complex mappings we'll do it recursively to keep things simple: map stages against each other, then map steps within each stage.
            3. Take the homologous (similar) pieces of flows and generate predictions for run time by aggregating them. We can use the status/timing APIs. This will combine run time, pause time, and maybe status. It should be VERY simple by design. Probably just do an average or median of times (subtracting pause), and maybe report an error bar if there are enough samples.
            4. Optional: a basic API where you can request run analysis by giving a WorkflowRun as input. This can internally use caching or an analysis thread pool (to avoid overloading the system).
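
To make the mapping-plus-filtering idea concrete, here is a minimal, language-neutral sketch in Python. It is not the proposed API: `Stage`, `first_name_match`, `usable_for_prediction`, and `map_runs` are hypothetical names, and the `"SUCCESS"` status string is a placeholder for whatever the real status type would be. It only illustrates a swappable strategy (first stage with the same name wins) followed by a filter that discards matches unsuitable for prediction.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """Hypothetical, simplified stand-in for one stage of a run."""
    name: str
    status: str        # placeholder values such as "SUCCESS" or "FAILURE"
    duration_ms: int


# Each current stage is paired with its match from a previous run, or None.
Pairing = List[Tuple[Stage, Optional[Stage]]]
MappingStrategy = Callable[[List[Stage], List[Stage]], Pairing]


def first_name_match(current: List[Stage], previous: List[Stage]) -> Pairing:
    """Simplest heuristic: the first previous stage with the same name matches."""
    pairs: Pairing = []
    for stage in current:
        match = next((p for p in previous if p.name == stage.name), None)
        pairs.append((stage, match))
    return pairs


def usable_for_prediction(stage: Stage) -> bool:
    """Filter: only stages that completed successfully feed predictions."""
    return stage.status == "SUCCESS"


def map_runs(
    current: List[Stage],
    previous: List[Stage],
    strategy: MappingStrategy = first_name_match,
    keep: Callable[[Stage], bool] = usable_for_prediction,
) -> Pairing:
    """Apply a pluggable mapping strategy, then drop matches the filter rejects."""
    return [
        (stage, match if match is not None and keep(match) else None)
        for stage, match in strategy(current, previous)
    ]
```

Because the strategy is just a function argument, a fancier heuristic (for example, one that recurses into steps within each matched stage) could replace `first_name_match` without touching the filter or the callers.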

            Basically what we'd do is digest a flow, then digest previous runs, try to map similar parts, then see if we have enough data for predictions, find the similar bits, then do estimates based on those. We're aiming for the simplest workable solution initially.
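
The prediction step described above could be as small as the following sketch. It assumes each homologous sample is a `(duration_ms, pause_ms)` pair, and the name `estimate_ms` and the threshold `MIN_SAMPLES_FOR_ERROR_BAR` are made up for illustration; the actual timing values would come from the status/timing APIs, which are not modeled here.

```python
import statistics
from typing import List, Optional, Tuple

# Assumption: how many samples count as "enough" for an error bar is a guess.
MIN_SAMPLES_FOR_ERROR_BAR = 3


def estimate_ms(
    samples: List[Tuple[int, int]], use_median: bool = False
) -> Optional[Tuple[float, Optional[float]]]:
    """Estimate active run time from (duration_ms, pause_ms) samples.

    Returns (estimate, error_bar), where error_bar is the sample standard
    deviation or None when there are too few samples. Returns None when
    there is no data at all.
    """
    # Subtract pause time so waiting on input does not inflate the estimate.
    active = [duration - pause for duration, pause in samples]
    if not active:
        return None
    center = statistics.median(active) if use_median else statistics.mean(active)
    spread = (
        statistics.stdev(active)
        if len(active) >= MIN_SAMPLES_FOR_ERROR_BAR
        else None
    )
    return center, spread
```

A median is less sensitive to one unusually slow run than a mean, which is one reason to keep both options available while the requirements are still being driven out.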

            Note: to map flow chunks, each flow chunk must be identified by:

            1. Label if present (for stages, parallel branches).
            2. Index within enclosing container (example, 1st FlowNode in the stage, 1st FlowNode in a parallel, 1st stage in a run).
            3. EITHER containers have a list of contained chunks (i.e. for BO a WorkflowRun will have a list of stages and parallel blocks), OR each chunk lists its parent container. This is to let us (for example) see if we have 2 parallel blocks with the same branch names vs. one parallel block.
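
As a rough illustration of the three identifying pieces above, the sketch below takes the "each chunk lists its parent container" option. `ChunkKey` and `path` are hypothetical names, not part of any existing API. Walking up the parent links yields a path that tells apart two branches with the same label living in different parallel blocks.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ChunkKey:
    """Hypothetical identity of a flow chunk within one run."""
    label: Optional[str]                  # stage or branch name, if present
    index: int                            # position within the enclosing container
    parent: Optional["ChunkKey"] = None   # enclosing container, None for the run

    def path(self) -> Tuple[Tuple[Optional[str], int], ...]:
        """Return (label, index) pairs from the run root down to this chunk."""
        prefix = self.parent.path() if self.parent is not None else ()
        return prefix + ((self.label, self.index),)


# Two parallel blocks in one run, each with a branch named "linux".
run = ChunkKey(label=None, index=0)
first_parallel = ChunkKey(label=None, index=0, parent=run)
second_parallel = ChunkKey(label=None, index=1, parent=run)
linux_a = ChunkKey(label="linux", index=0, parent=first_parallel)
linux_b = ChunkKey(label="linux", index=0, parent=second_parallel)

# Same label and same index, but the parent path keeps them distinct.
print(linux_a.path() != linux_b.path())
```

The alternative, where containers hold a list of their children, carries the same information; the parent-link form is used here only because it keeps the example short.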

            *Why all this?*

            1. We're seeing increasingly dynamic pipeline structures – conditional execution of parallel branches, whole stages, or more.
            2. We can't directly line up nodes by ID, because some stages may have different numbers of steps (example: retry blocks).
            3. Generating mappings has all the hard bits, so we isolate it for testability and simplicity.
            4. We can delegate to the pipeline graph analysis StatusAndTiming APIs for all the timing info

            People

              Unassigned Unassigned
              jamesdumay James Dumay