Jenkins / JENKINS-35821

API to return per-node averages and ETA based on history of execution


    Details

    • Sprint: 1.0-m7

      Description

      In scope


            Activity

            Michael Neale added a comment -

            Vivek Pandey is this really redundant now that the real work is to move to the bismuth API? If so, feel free to close it.

            Vivek Pandey added a comment -

            Michael Neale The bismuth API provides a basic structure for parsing executed flow nodes with a SAX-like parser, and it also provides a structure for timing info. Let's keep this ticket open, as we still need to implement these capabilities in the Blue Ocean API.

            James Dumay added a comment -

            Sam Van Oort is there an API to do this yet?

            James Dumay added a comment -

            As discussed:

            • There is a low-level API in bismuth to retrieve this data.
            • We will use the low-level API in Blue Ocean to drive out the requirements of a future higher-level API.
            • Think of it as a PoC that lives within Blue Ocean, which we could either throw away or reuse.
            • Vivek Pandey to meet with Sam Van Oort to coordinate how it should be done.
            Sam Van Oort added a comment (edited) -

            Per a meeting with @vivek on Friday, there are a couple of parts to this, and breaking it up into smaller pieces will make it easier. We have a follow-up meeting planned in a couple of weeks to sync again and hammer out more of the fine details.

            Pieces:

            1. Collect a bit more structural information in BO during flow analysis (an additional Map or two). This lets us overlay a tree-like, DOM-like view of flow node data (depth-limited for BO though, so simplified).
            2. Create a generic API accepting a DOM-like interface, which generates mappings of similar ("homologous") parts of the flow execution across WorkflowRuns (i.e. stage 'bob' in Run #1 maps to stage 'bob' in Run #2). A rough sketch of what such an interface might look like follows this list.
            • This combines a mapping component (the first stage with the same name matches) with a filtering component (e.g. if a stage didn't complete with SUCCESSFUL, it can't be used for prediction).
            • This will support pluggable strategies (heuristics). That lets us do the simplest version first and rip it out entirely if needed (and give Blue Ocean a basic form without much nesting, while analytics can do a fancier one if desired). It also matters because mappings are very fiddly (see: the many Stage View bugs logged for this aspect).
            • Mappings will be fuzzy/best-effort, and can fail entirely if two runs are too different.
            • For more complex mappings we'll work recursively to simplify things – map stages against each other, then map steps within each stage.
            3. Take the homologous (similar) pieces of flows and generate predictions for run time by aggregating them. We can use the status/timing APIs. This will combine run time, pause time, and maybe status. It should be VERY simple by design – probably just an average or median of times (subtracting pause), maybe with an error bar if there are enough samples.
            4. Optional: a basic API where you can request run analysis by giving a WorkflowRun as input – this can internally use caching or an analysis thread pool (to avoid overloading the system).
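
            To make the pluggable-strategy idea in point 2 concrete, here is a minimal sketch. Every name in it (FlowChunkNode, ChunkMapper, LabelMatchingMapper) is hypothetical – none of these types exist in bismuth or Blue Ocean – and it only illustrates the shape such an API could take:

            {code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical DOM-like view of one chunk of a flow (stage, parallel branch, ...). */
interface FlowChunkNode {
    Optional<String> getLabel();        // stage/branch name, if present
    int getIndexInParent();             // position within the enclosing container
    List<FlowChunkNode> getChildren();  // contained chunks (depth-limited for BO)
}

/** Hypothetical pluggable heuristic mapping "homologous" chunks across two runs. */
interface ChunkMapper {
    /** Best-effort mapping; may be empty if the two runs are too different. */
    Map<FlowChunkNode, FlowChunkNode> map(FlowChunkNode current, FlowChunkNode prior);
}

/** Simplest strategy: the first prior chunk with the same label matches. */
class LabelMatchingMapper implements ChunkMapper {
    @Override
    public Map<FlowChunkNode, FlowChunkNode> map(FlowChunkNode current, FlowChunkNode prior) {
        Map<FlowChunkNode, FlowChunkNode> result = new HashMap<>();
        for (FlowChunkNode c : current.getChildren()) {
            if (!c.getLabel().isPresent()) {
                continue;  // unlabeled chunks would be matched by index instead
            }
            prior.getChildren().stream()
                    .filter(p -> c.getLabel().equals(p.getLabel()))
                    .findFirst()
                    .ifPresent(p -> result.put(c, p));
            // For nested structures this would recurse: map stages first,
            // then map steps within each mapped stage.
        }
        return result;
    }
}
            {code}

            A filtering pass (e.g. dropping chunks that did not complete successfully) would run before mapping, so failed history never feeds an estimate.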

            Basically we'd digest the current flow, digest previous runs, try to map similar parts, check whether we have enough data for predictions, and then generate estimates from the similar bits. We're aiming for the simplest workable solution initially.
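
            As a hedged illustration of the aggregation step in point 3 – assuming the samples come only from successfully mapped, completed chunks, and with purely illustrative names – the estimator could be as small as:

            {code:java}
import java.util.Arrays;

/** Illustrative aggregator: median of (total - pause) durations across prior runs. */
final class DurationEstimator {
    /** Returns an estimate in milliseconds, or -1 if there is no usable history. */
    static long estimateMillis(long[] totalMillis, long[] pauseMillis) {
        if (totalMillis.length == 0 || totalMillis.length != pauseMillis.length) {
            return -1L;
        }
        long[] active = new long[totalMillis.length];
        for (int i = 0; i < totalMillis.length; i++) {
            active[i] = totalMillis[i] - pauseMillis[i];  // subtract pause time
        }
        Arrays.sort(active);
        int mid = active.length / 2;
        // Median is robust to the occasional outlier run; an error bar could be
        // derived from e.g. the interquartile range once there are enough samples.
        return (active.length % 2 == 1)
                ? active[mid]
                : (active[mid - 1] + active[mid]) / 2;
    }
}
            {code}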

            Note: to map flow chunks, each flow chunk must be identified by:

            1. Label if present (for stages, parallel branches).
            2. Index within the enclosing container (for example, the 1st FlowNode in the stage, the 1st FlowNode in a parallel, the 1st stage in a run).
            3. EITHER containers have a list of contained chunks (i.e. for BO a WorkflowRun will have a list of stages and parallel blocks), OR each chunk lists its parent container. This lets us (for example) tell 2 parallel blocks with the same branch names apart from one parallel block. A minimal identity key along these lines is sketched below.
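
            Purely for illustration, those three rules could collapse into a small identity key like the following (all names hypothetical; the parentPath encoding is just one way to satisfy point 3):

            {code:java}
import java.util.Objects;

/** Illustrative identity key for one flow chunk, per the three rules above. */
final class ChunkKey {
    final String parentPath;   // e.g. "/stage:build/parallel:0" – distinguishes two
                               // parallel blocks with the same branch names
    final String label;        // stage/branch label, or null if none
    final int indexInParent;   // e.g. 1st FlowNode in the stage

    ChunkKey(String parentPath, String label, int indexInParent) {
        this.parentPath = parentPath;
        this.label = label;
        this.indexInParent = indexInParent;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ChunkKey)) return false;
        ChunkKey k = (ChunkKey) o;
        return indexInParent == k.indexInParent
                && Objects.equals(parentPath, k.parentPath)
                && Objects.equals(label, k.label);
    }

    @Override public int hashCode() {
        return Objects.hash(parentPath, label, indexInParent);
    }
}
            {code}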

            *Why all this?*

            1. We're seeing increasingly dynamic pipeline structures – conditional execution of parallel branches, whole stages, or more.
            2. We can't directly line up nodes by ID, because some stages may have different numbers of steps (example: retry blocks).
            3. Generating mappings has all the hard bits, so we isolate it for testability and simplicity.
            4. We can delegate to the pipeline graph analysis StatusAndTiming APIs for all the timing info.

              People

              Assignee:
              Unassigned
              Reporter:
              James Dumay
              Votes:
              1
              Watchers:
              6
