
Use GitHub REST API caching with last build timestamp to avoid draining the quota allowance

      I asked around on the GitHub product forum, and per [https://platform.github.community/t/should-rest-api-quotas-be-above-5k-for-larger-projects/4643/8] it seems we do not need to maintain a big, complex cache at all:

      >> jimklimov: I think for this use-case, it would suffice for their GitHub plugin
      >> to ask its questions with If-Modified-Since timestamp of last run of this job –
      >> no real caching needed, right?

      > gr2m: Yes, exactly. You would send the timestamp since the last time you
      > pulled in new data and the GitHub API will respond with 304 if there are no
      > new changes, which will not count against your quota.
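The exchange above can be sketched as follows, assuming Python's standard `urllib`. The helper name `fetch_if_modified` and its injectable `opener` parameter are hypothetical and not part of any Jenkins plugin. `urllib` reports a 304 response as an `HTTPError`, so the sketch treats that one status as "nothing new" and re-raises everything else.

```python
import urllib.error
import urllib.request


def fetch_if_modified(url, if_modified_since, opener=urllib.request.urlopen):
    """Return the response body, or None if the server answers 304 Not Modified.

    `if_modified_since` is an HTTP-date string such as
    "Wed, 24 Jan 2018 00:00:00 GMT". `opener` is injectable for testing
    (an assumption of this sketch); it defaults to urllib's urlopen.
    """
    request = urllib.request.Request(
        url, headers={"If-Modified-Since": if_modified_since}
    )
    try:
        with opener(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Nothing changed since the given timestamp.
            return None
        # Any other HTTP error is a real failure; let the caller see it.
        raise
```

A caller would skip its usual processing when the function returns `None` and only parse the body otherwise.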

      Jenkins keeps a history of its recent job runs, whether those are builds that check out some branch (or an ephemeral branch for a PR) or organization/repo scans that look for new repos, branches and PRs. The tip of each such history carries the timestamp we need to send as If-Modified-Since, perhaps give or take a few seconds around the starting timestamp of that job's last successful run.

      This approach avoids the apparently troubled history of the cache file used earlier in this project's lifetime, and it avoids burning through the GitHub quota. Just send the same requests with one additional header line.
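As a rough illustration of that "additional header line", here is a minimal Python sketch that turns a job's last run time into an `If-Modified-Since` header. The function name, the `last_run_epoch_seconds` argument and the `slack_seconds` default are all hypothetical; how Jenkins would actually expose the last successful run's start time is not covered here. The sketch moves the timestamp slightly earlier, which is one way to read "give or take a few seconds".

```python
from email.utils import formatdate


def if_modified_since_header(last_run_epoch_seconds, slack_seconds=5):
    """Build an If-Modified-Since header from a job's last run time.

    `last_run_epoch_seconds` stands in for the start time of the last
    successful run (Unix time, UTC). `slack_seconds` moves the timestamp
    slightly earlier to allow for clock skew. Both names are hypothetical.
    """
    timestamp = max(0, last_run_epoch_seconds - slack_seconds)
    # usegmt=True yields the HTTP-date form, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    return {"If-Modified-Since": formatdate(timestamp, usegmt=True)}
```

For example, `if_modified_since_header(1516752000, slack_seconds=0)` returns `{"If-Modified-Since": "Wed, 24 Jan 2018 00:00:00 GMT"}`, which can be merged into the headers of the request the job already sends.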

       

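      Additional literature

       * https://developer.github.com/v3/activity/events/ -- maybe polling these can help discover recent... events? Such as maybe new commits or new branch heads (not sure, but still worth checking)
       * https://developer.github.com/v3/#conditional-requests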

      ( Originally posted at https://github.com/jglick/github-branch-source-plugin/issues/8 )

          [JENKINS-49149] Use GitHub REST API caching with last build timestamp to avoid draining the quota allowance

          Jim Klimov created issue -
          Jim Klimov made changes -
          Description New: I asked around the Github product forum, and it seems that per [https://platform.github.community/t/should-rest-api-quotas-be-above-5k-for-larger-projects/4643/8] it suffices to not really track a big bad complex cache at all:

          {{>> jimklimov: I think for this use-case, it would suffice for their GitHub plugin}}
          {{>> to ask its questions with If-Modified-Since timestamp of last run of this job –}}
          {{>> no real caching needed, right?}}

          {{> gr2m12d : Yes, exactly. You would send the timestamp since the last time you}}
          {{> pulled in new data and the GitHub API will respond with 304 if there are no}}
          {{> new changes, which will not count against your quota.}}

          Since Jenkins has histories of its recent job runs, be it latest builds with a checkout of some branch (or ephemeral branch for PRs), or an organization/repo scan to find the new repos and branches and PRs, each tip of such history has the timestamp we need to ask for {{If-Modified-Since}}. Maybe give or take a few seconds around that job's last successful starting timestamp, for example.

          This approach avoids the apparently nasty history with a cachefile earlier in this project's lifetime, and reasonably benefits from not chewing the Github quotas. Just post same requests with an additional header line.

           

          Additional literature :)
           * [https://developer.github.com/v3/activity/events/] – maybe polling these can help discover recent... events? Such as maybe new commits or new branch heads (not sure, but still worth checking)
           * [https://developer.github.com/v3/#conditional-requests]

           

          _( Originally posted at [https://github.com/jglick/github-branch-source-plugin/issues/8] )_
          Jim Klimov made changes -
          Assignee New: Andrew Bayer [ abayer ]
          Andrew Bayer made changes -
          Assignee Original: Andrew Bayer [ abayer ]

            Assignee: Unassigned
            Reporter: Jim Klimov