  1. Jenkins
  2. JENKINS-31039

Queue items are ephemeral, causing API users to lose track of builds.


    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Blocker
    • Resolution: Unresolved
    • Component/s: core
    • Labels:
      None
    • Environment:
      Jenkins ver. 1.565.3

      Description

      This issue was filed in response to a problem encountered while using the Python API. The corresponding bug for the Python API is at: https://github.com/salimfadhley/jenkinsapi/issues/335
      The issue is that after a build has been invoked, a queue item URL is returned, which can then be polled for further information, such as the build number and the build URL, once an actual build has started. However, after the job starts building, the queue item URL is removed after about 30 seconds (I assume that until then the queue item has simply not been garbage collected yet). This leaves API users who haven't yet polled the queue item to learn which build it turned into scratching their heads, wondering what happened to it. In layman's terms: "Their build job disappeared."
      Example:
      Invoke a build using curl.

      $ curl -i -X POST  http://jenkins.example.com/job/Blank_Job/buildWithParameters --user $username:$password
      HTTP/1.1 201 Created
      Server: Apache-Coyote/1.1
      Location: http://jenkins.example.com/queue/item/336/
      Content-Length: 0
      Date: Thu, 18 Dec 2014 03:03:40 GMT
      

      Notice that a queue item URL is returned in the Location header. Provided that the build has started and the queue item has not yet been garbage collected, querying that URL returns the build number as well as the build URL.

      $ curl -X POST   http://jenkins.example.com/queue/item/336/api/xml --user $username:$password | grep -i '<executable>.*</executable>' --color -o
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100   992  100   992    0     0   2278      0 --:--:-- --:--:-- --:--:--  2517
      <executable><number>487</number><url>http://jenkins/job/Blank_Job/487/</url></executable>
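
The same extraction can be done without grep. Below is a minimal Python sketch, assuming the response body has the shape shown above (an `<executable>` element containing `<number>` and `<url>`). The function name `parse_executable` and the `<item>` root element in the sample are illustrative assumptions, not part of Jenkins or jenkinsapi.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def parse_executable(xml_text: str) -> Optional[Tuple[int, str]]:
    """Return (build_number, build_url) from a queue item's api/xml body.

    Returns None when no complete <executable> element is present, which
    is assumed here to mean that the build has not started yet.
    """
    root = ET.fromstring(xml_text)
    executable = root.find(".//executable")
    if executable is None:
        return None
    number = executable.findtext("number")
    url = executable.findtext("url")
    if not number or not url:
        return None
    return int(number), url


if __name__ == "__main__":
    # Hypothetical response body; only the <executable> part is taken
    # from the output shown above.
    sample = (
        "<item><executable><number>487</number>"
        "<url>http://jenkins/job/Blank_Job/487/</url></executable></item>"
    )
    print(parse_executable(sample))
```

Returning None rather than raising lets a caller distinguish "not started yet" from a malformed response, which still raises a parse error.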
      

      However, querying the same queue item moments later returns a 404 error. To reiterate, there is a timing element to this issue: if the API user polls the queue item at a relatively short interval (about every second or so), before the queue item can be garbage collected, this is not normally a problem. It does become a problem for API users who don't want to, or can't, poll every second in an attempt to beat the race condition.
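
The race can be made explicit in client code. Below is a rough Python sketch of a polling loop that treats a 404 as "the queue item is gone" rather than as a generic failure. The names `wait_for_build_number` and `QueueItemGone`, and the caller-supplied `fetch` function (imagined as a thin wrapper around an HTTP request to the queue item's api/xml URL), are hypothetical and not part of jenkinsapi.

```python
import time
import xml.etree.ElementTree as ET
from typing import Callable, Tuple


class QueueItemGone(Exception):
    """The queue item returned 404 before a build number was seen."""


def wait_for_build_number(
    fetch: Callable[[], Tuple[int, str]],
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll a queue item until it reports a build number.

    `fetch` is a caller-supplied (hypothetical) function that requests
    the queue item's api/xml URL and returns (status_code, body).
    """
    waited = 0.0
    while True:
        status, body = fetch()
        if status == 404:
            # The item was removed before we learned its build number:
            # this is the race condition described above.
            raise QueueItemGone("queue item is no longer available")
        if status == 200:
            number = ET.fromstring(body).findtext(".//executable/number")
            if number:
                return int(number)
        if waited >= timeout:
            raise TimeoutError("no build number within the timeout")
        sleep(interval)
        waited += interval
```

Note that a short `interval` only narrows the window; a client that starts polling late still hits `QueueItemGone`, which is the core of this report.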

      I can see several solutions to this:

      1. Never remove items from the queue.
      2. Guarantee that the URL will still return information for a much longer interval (24 hours might be overkill, but several minutes shouldn't be too difficult).
      3. Leave the removal of items from the queue to the API users.
      4. Allow API users to lock the queue item with a timestamp, signaling to Jenkins that the queue item needs to stay alive at least until that time before it is removed. This way, an API user who knows they won't be asking for the job's information for quite some time can tell Jenkins beforehand. Kind of a "Hey, I will be checking on this around this time" type of deal.

      Any of these would allow API users to pick up the breadcrumbs and find out what happened to their build.
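
To make options 2 and 4 concrete, here is a toy Python model of a store that keeps finished queue items for a minimum retention period and lets a client extend that period per item. It is purely illustrative: the class `LeftItemStore`, its methods, and the 300-second default are assumptions made for this sketch and do not describe how Jenkins actually stores queue items.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LeftItemStore:
    """Toy model: retain queue items that became builds for a while."""

    def __init__(self, retention: float = 300.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._retention = retention  # assumed default, in seconds
        self._clock = clock
        # item id -> (build number, time after which it may be dropped)
        self._items: Dict[int, Tuple[int, float]] = {}

    def item_left_queue(self, item_id: int, build_number: int) -> None:
        """Record that a queue item turned into a build (option 2)."""
        expires = self._clock() + self._retention
        self._items[item_id] = (build_number, expires)

    def keep_until(self, item_id: int, timestamp: float) -> bool:
        """Keep the item alive at least until `timestamp` (option 4)."""
        self._purge()
        if item_id not in self._items:
            return False
        build_number, expires = self._items[item_id]
        # Never shorten the retention, only extend it.
        self._items[item_id] = (build_number, max(expires, timestamp))
        return True

    def lookup(self, item_id: int) -> Optional[int]:
        """Return the build number, or None if unknown or expired."""
        self._purge()
        entry = self._items.get(item_id)
        return entry[0] if entry is not None else None

    def _purge(self) -> None:
        now = self._clock()
        expired = [i for i, (_, exp) in self._items.items() if exp <= now]
        for i in expired:
            del self._items[i]
```

A real implementation would also need an upper bound on `timestamp`, otherwise option 4 degenerates into option 1 (items never removed).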

      See also the following issue:
      https://issues.jenkins-ci.org/browse/JENKINS-12827
      Particularly the comment:
      https://issues.jenkins-ci.org/browse/JENKINS-12827?focusedCommentId=218307&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-218307


          Activity

          khirod khirod panda added a comment -

          It's version 2.60.

           

          bane73 Brandon Gresham added a comment -

          Daniel Beck

          "My confusion about which of these behavior this issue is about is caused by conflation like this, so I recommend you file a new issue for that"

           

          My apologies. I understand what you are driving at now. The issue is similar to the one I'm talking about, but you're right: not exactly the same. I'm not going to file a separate ticket b/c at this time I don't require this RFE and I'm an extremely minor user of the Jenkins API. I'll revisit later if the situation warrants.

          Thank you again for the fabulous work you guys do – my team loves Jenkins and for some stupid reason are constantly having to defend it to mgmt against MS and other costlier competitors.  

          danielbeck Daniel Beck added a comment -

          khirod panda, is the user who tries to access the queue item authenticated as a user with Overall/Administer permission?

          (For the record, 2.73.2 included a fix that results in queue items being inaccessible if the user attempting to access the queue item has no permission to do so, but that's very recent, and wasn't in 2.60(.x). Still, not sure where else to start here.)

          weiwongfaye wei wang added a comment -

          I am also experiencing this issue. IMO, it is very common to trigger a job remotely, but I can't find a way to retrieve the build ID reliably. jenkinsapi's invoke method returns a queue object from which the build number can be obtained later; however, this issue makes that approach unreliable too.

           

          nihanth Nihanth Pabba added a comment -

          After triggering a build using the remote API, would a QueueListener event, used via the Groovy Event Listener plugin, help in getting the build number for a particular queue ID?


            People

            Assignee:
            Unassigned
            Reporter:
            rusty Russell Weber
            Votes:
            13
            Watchers:
            27
