• Icon: Improvement Improvement
    • Resolution: Unresolved
    • Icon: Major Major
    • p4-plugin
    • None
    • Jenkins Core 1.488
      Perforce 1.3.17

      Please enhance the change reporting when a build occurs, to only show relevant files and changelists.

      [For flexibility / backwards compatibility, perhaps this behaviour should be enabled by a checkbox]

      [Ideally the filtering would be done by use of p4 commands and not by the Jenkins plugin, for scalability and to limit network traffic]

      -----------
      More specific details:

      1) At /job/<name>/<build #>/changes page: only show a changelist if one or more files in that changelist are mapped to the client.

      2) At /job/<name>/<build #>/changes page: only show those files within a changelist that are mapped to the client.

      3) Set P4_CHANGELIST to the most recent changelist which included one or more files mapped to the client [I use @${ENV,var="P4_CHANGELIST"} to label the build with the last change # together with build-name-setter plugin]

          [JENKINS-15653] Filter changes by viewspec

          Rob Petti added a comment -

          The plugin already only shows the changesets that contain files that are in the currently defined workspace/view map. So #1, (and by extension #3,) is already implemented...

          The file listing for each changeset is provided by perforce itself, and the 'p4 describe' command used does not have the ability to filter that file listing. As you say, ideally perforce would do this filtering for you, but I don't believe there is such a command for doing that... Please let me know if I'm wrong.

          Rob Petti added a comment - The plugin already only shows the changesets that contain files that are in the currently defined workspace/view map. So #1, (and by extension #3,) is already implemented... The file listing for each changeset is provided by perforce itself, and the 'p4 describe' command used does not have the ability to filter that file listing. As you say, ideally perforce would do this filtering for you, but I don't believe there is such a command for doing that... Please let me know if I'm wrong.

          Ben Golding added a comment -

          Thanks Rob.

          #1: I validated this works OK in my Jenkins.

          #3: This seems to work OK if one or more changes occurred since the last build. In that case P4_CHANGELIST matches the most recent entry on the /changes page. However when /changes shows 'no changes since last build' I would expect P4_CHANGELIST to be the same as the previous build, but it's not the case. I think this is a bug.

          Ben Golding added a comment - Thanks Rob. #1: I validated this works OK in my Jenkins. #3: This seems to work OK if one or more changes occurred since the last build. In that case P4_CHANGELIST matches the most recent entry on the /changes page. However when /changes shows 'no changes since last build' I would expect P4_CHANGELIST to be the same as the previous build, but it's not the case. I think this is a bug.

          Ben Golding added a comment -

          #2: Can be done using 'p4 fstat' in place of 'p4 describe':

          p4 -c <client> fstat -Rc -e <changelist> //...

          will list only files mapped through the client view.
          [Changelist description is also included at the end]

          In case you want to also keep supporting an 'unfiltered' changes view, you can replace

          p4 describe -s <changelist>

          by simply

          p4 fstat -e <changelist>

          Further optimizations of fstat output (-> reduced network traffic).
          Measured using a moderate size real changelist on my Perforce:

          47654 bytes
          33125 bytes with -s
          12676 bytes with -s -T depotFile [p4d >= 2009.1]
          12676 bytes with -T depotFile [p4d >= 2009.1]

          [I could not find documentation of exactly when -s was added. Probably p4d < 1999]

          Ben Golding added a comment - #2: Can be done using 'p4 fstat' in place of 'p4 describe': p4 -c <client> fstat -Rc -e <changelist> //... will list only files mapped through the client view. [Changelist description is also included at the end] In case you want to also keep supporting an 'unfiltered' changes view, you can replace p4 describe -s <changelist> by simply p4 fstat -e <changelist> Further optimizations of fstat output (-> reduced network traffic). Measured using a moderate size real changelist on my Perforce: 47654 bytes 33125 bytes with -s 12676 bytes with -s -T depotFile [p4d >= 2009.1] 12676 bytes with -T depotFile [p4d >= 2009.1] [I could not find documentation of exactly when -s was added. Probably p4d < 1999]

          Rob Petti added a comment -

          Your comment on #3 is probably the same issue described in JENKINS-15515.

          rpetti@petti:~/workspace/$ time p4 describe -s 297622 | wc   
              502    1498   47760
          
          real	0m0.016s
          user	0m0.000s
          sys	0m0.004s
          rpetti@petti:~/workspace/$ time p4 fstat -Rc -e 297622 //... | wc
             5447   14360  186203
          
          real	0m1.430s
          user	0m0.004s
          sys	0m0.000s
          

          In my tests, fstat is about 100 times times slower and grossly inefficient in terms of network traffic. It's true that '-T depotFile' will reduce the amount of network traffic, but we're not able to use it because we need to support older servers. I'm not sure that using fstat is the best option here... We might need to do client-side filtering if the need is great enough.

          Rob Petti added a comment - Your comment on #3 is probably the same issue described in JENKINS-15515 . rpetti@petti:~/workspace/$ time p4 describe -s 297622 | wc 502 1498 47760 real 0m0.016s user 0m0.000s sys 0m0.004s rpetti@petti:~/workspace/$ time p4 fstat -Rc -e 297622 //... | wc 5447 14360 186203 real 0m1.430s user 0m0.004s sys 0m0.000s In my tests, fstat is about 100 times times slower and grossly inefficient in terms of network traffic. It's true that '-T depotFile' will reduce the amount of network traffic, but we're not able to use it because we need to support older servers. I'm not sure that using fstat is the best option here... We might need to do client-side filtering if the need is great enough.

          Ben Golding added a comment - - edited

          I agree, linked to JENKINS-15515

          Ben Golding added a comment - - edited I agree, linked to JENKINS-15515

          Ben Golding added a comment -

          I don't see a big time difference between fstat/describe. Each takes around 0.1 second in my tests:

          bash-2.05b$ time p4 -c VPOM6430_bgolding-e4310 describe -s 2448248 | wc
             1809    5408  225392
          
          real    0m0.085s
          user    0m0.010s
          sys     0m0.020s
          $ time p4 -c VPOM6430_bgolding-e4310 fstat -Rc -e 2448248 //... | wc
             1352    3557   46302
          
          real    0m0.060s
          user    0m0.000s
          sys     0m0.010s
          

          Anyhow I'm not really sure whether time to complete is an important metric.
          The fstat commands could be run in a background thread, in parallel with the actual build.
          Multiple commands can also be batched up with p4 -x <file>.
          [Of course, either describe or fstat could benefit from such improvements]

          Regarding network traffic / filtering efficiency: it depends what % of the files in each changelist are mapped to the client. In the worst case, fstat has almost 300% overhead vs. describe, but with -T only ~ 2%.

          Regarding -s and -T switches: I assumed your plugin already queries the server version, and adjusts p4 commands appropriately. Anyhow we should not complain about inefficiency while also taking a 'lowest common denominator' approach.

          Bottom line: Which is faster/better comes down to many variables: p4d version; bandwidth; latency; server load/performance; size of server depot; # files in typical changelist; # files in typical client; etc. So I suggest to let the user decide whether to enable filtering or not, perhaps even in a per-client advanced setting. For me, this is a very important missing feature when doing triage of broken builds, and saves much more (human) time than a few seconds (machine time) overhead of fstat commands.

          Ben Golding added a comment - I don't see a big time difference between fstat/describe. Each takes around 0.1 second in my tests: bash-2.05b$ time p4 -c VPOM6430_bgolding-e4310 describe -s 2448248 | wc 1809 5408 225392 real 0m0.085s user 0m0.010s sys 0m0.020s $ time p4 -c VPOM6430_bgolding-e4310 fstat -Rc -e 2448248 //... | wc 1352 3557 46302 real 0m0.060s user 0m0.000s sys 0m0.010s Anyhow I'm not really sure whether time to complete is an important metric. The fstat commands could be run in a background thread, in parallel with the actual build. Multiple commands can also be batched up with p4 -x <file>. [Of course, either describe or fstat could benefit from such improvements] Regarding network traffic / filtering efficiency: it depends what % of the files in each changelist are mapped to the client. In the worst case, fstat has almost 300% overhead vs. describe, but with -T only ~ 2%. Regarding -s and -T switches: I assumed your plugin already queries the server version, and adjusts p4 commands appropriately. Anyhow we should not complain about inefficiency while also taking a 'lowest common denominator' approach. Bottom line: Which is faster/better comes down to many variables: p4d version; bandwidth; latency; server load/performance; size of server depot; # files in typical changelist; # files in typical client; etc. So I suggest to let the user decide whether to enable filtering or not, perhaps even in a per-client advanced setting. For me, this is a very important missing feature when doing triage of broken builds, and saves much more (human) time than a few seconds (machine time) overhead of fstat commands.

          Rob Petti added a comment -

          Unfortunately, checkout processes cannot be forked into the background to run parallel with the build. All checkout operations need to be completed during the checkout phase.

          I assumed you were having some kind of issue with the way it currently pulls data from perforce when you brought up network efficiency... Sorry for the confusion. If it's not that big of a deal, then I'd like to just do the filtering on the client-side as it's pulling data down.

          As for having a separate option to control this, I don't think it's really necessary. It's perfectly reasonable for this to be the default, and most people won't even notice because they don't check in across multiple projects at the same time.

          Rob Petti added a comment - Unfortunately, checkout processes cannot be forked into the background to run parallel with the build. All checkout operations need to be completed during the checkout phase. I assumed you were having some kind of issue with the way it currently pulls data from perforce when you brought up network efficiency... Sorry for the confusion. If it's not that big of a deal, then I'd like to just do the filtering on the client-side as it's pulling data down. As for having a separate option to control this, I don't think it's really necessary. It's perfectly reasonable for this to be the default, and most people won't even notice because they don't check in across multiple projects at the same time.

          Ben Golding added a comment -

          Server side filtering would be more efficient in many real cases, such as an integrate to create a new branch, or an automated nightly integrate between branches (we do both): i.e. when you're only interested in a few files from a big changelist. It also means you don't have to implement your own client side filter algorithm which is just duplicating Perforce's own functionality, and a potential home where new bugs can nest

          I don't see an advantage to the client side filtering, except that you save a few seconds and some unnecessary traffic on older servers (we use 2011.1 where I could measure no real difference).

          From my side, either should be acceptable in practice - but client-side seems like more work for little payback. IMHO better to spend the time adding server version detection (so you can use fstat -T and many other new features) than worry about performance with older versions.

          Ben Golding added a comment - Server side filtering would be more efficient in many real cases, such as an integrate to create a new branch, or an automated nightly integrate between branches (we do both): i.e. when you're only interested in a few files from a big changelist. It also means you don't have to implement your own client side filter algorithm which is just duplicating Perforce's own functionality, and a potential home where new bugs can nest I don't see an advantage to the client side filtering, except that you save a few seconds and some unnecessary traffic on older servers (we use 2011.1 where I could measure no real difference). From my side, either should be acceptable in practice - but client-side seems like more work for little payback. IMHO better to spend the time adding server version detection (so you can use fstat -T and many other new features) than worry about performance with older versions.

          Ben Golding added a comment -

          Finally, just to clarify: I don't have an issue with the way data is currently pulled from Perforce.

          As a programmer it's natural when suggesting a new feature, to immediately start thinking about the best way to implement it

          Ben Golding added a comment - Finally, just to clarify: I don't have an issue with the way data is currently pulled from Perforce. As a programmer it's natural when suggesting a new feature, to immediately start thinking about the best way to implement it

          Rob Petti added a comment -

          I compared fstat performance on several systems across several versions of perforce (including 2012.1) and they all perform poorly compared to describe -s. This makes a lot of sense to me, since fstat needs to query several databases in order to get the information it needs, whereas describe just needs to query one or two.

          Performance aside, I suggested using client-side filtering because it's actually easier and quicker to implement. The filtering code is already in the plugin, and is being actively used for some features. It would be far simpler to just use that instead of writing a new parsing algorithm to handle fstat output.

          Just my 2c... I honestly don't mind how it's implemented as long as it doesn't break anything.

          Rob Petti added a comment - I compared fstat performance on several systems across several versions of perforce (including 2012.1) and they all perform poorly compared to describe -s. This makes a lot of sense to me, since fstat needs to query several databases in order to get the information it needs, whereas describe just needs to query one or two. Performance aside, I suggested using client-side filtering because it's actually easier and quicker to implement. The filtering code is already in the plugin, and is being actively used for some features. It would be far simpler to just use that instead of writing a new parsing algorithm to handle fstat output. Just my 2c... I honestly don't mind how it's implemented as long as it doesn't break anything.

            Unassigned Unassigned
            bgolding Ben Golding
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated: