After a couple of days, Hudson appears to have a 'phantom' build queued. It never executes, but it blocks the next job from running. The currently executing build is 'Collectors' and the stuck build is 'SportsEngine-Collection'. The stuck build is a Maven2 build that uses the locks-and-latches plugin and polls SVN every minute.

          [JENKINS-6565] Hudson stuck on inexistent job

          Robert Munteanu added a comment -

          The build queue on the front page does not show the phantom build; only the one on the project page does.

          aurelien_pupier added a comment -

          I also encountered this issue.
          The stuck build is an ant build for me.

          Robert Munteanu added a comment -

          I think this happens in the following scenario for me:

          • Have 2 jobs protected by the same lock;
          • Job one polls from SVN, job two runs on demand;
          • Manually start the second job;
          • The first job starts due to SCM polling, but is blocked;
          • Cancel the first job.

          Robert Munteanu added a comment -

          I've narrowed it down to:

          • start job1 - protected by a lock;
          • start job2 - protected by the same lock;
          • cancel job2.

          job2 is now 'half-cancelled'.

          I believe that this belongs to the locks-and-latches plugin, so I am moving it to that component.

          Jan Schormann added a comment -

          I believe this is linked to JENKINS-6901, JENKINS-8223, and JENKINS-11031.
          In our environment (Jenkins ver. 1.437, Hudson Locks and Latches 0.6) I can consistently reproduce the state where

          • one job hangs indefinitely,
          • this can only be seen on the job's own page, not on the master page or on the slave page (cf. rombert's comment of 19/May/10),
          • restarting the master is necessary to resolve the situation. (seems to be common to all of 6901, 8223, 11031)
          • the same job cannot run again, because it seems to be running already.

          How to reproduce:

          1. have 3 jobs that share a lock, have 2 executors
          It works even if there are only 2 different jobs that do nothing more than "sleep 30" (no SCM required).
          2. start 1 job, then start another 2 jobs
          --> The first running job has acquired the lock, the other two are waiting in the queue.
          3. Once job #1 finishes, the other 2 are simultaneously allocated an executor each.
          4. One of them waits (no output at all), because the other one holds the lock.
          --> This is the safety net as described e.g. in JENKINS-11031, so far everything's fine.
          5. Kill the job waiting-in-executor while the other one is still running.

          Now we're in the strange state where different pages disagree on whether the job is actually running, and resolving the situation requires a master restart.
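
          Step 5 aborts a job that is only waiting for the lock. As a minimal sketch of how such a waiter could be made cancellable without leaving anything behind, the Python toy below polls the lock with a timeout and checks a cancel flag between attempts. This is not the plugin's code and says nothing about its actual cause of failure; `wait_for_lock`, `cancel_flag` and `shared_lock` are hypothetical names introduced for illustration only.

```python
import threading


def wait_for_lock(lock, cancelled, poll_seconds=0.05):
    """Block until `lock` is acquired or `cancelled` is set.

    Returns True if the lock was acquired (the caller must release it),
    False if the wait was cancelled (the lock is NOT held).
    Hypothetical helper, not part of Jenkins or the plugin.
    """
    while not cancelled.is_set():
        # acquire() with a timeout returns False instead of blocking forever,
        # so the loop gets a chance to notice the cancel flag.
        if lock.acquire(timeout=poll_seconds):
            return True
    return False


shared_lock = threading.Lock()
shared_lock.acquire()  # stands in for the job that is still running

cancel_flag = threading.Event()
result = {}
waiter = threading.Thread(
    target=lambda: result.update(acquired=wait_for_lock(shared_lock, cancel_flag))
)
waiter.start()         # stands in for the job waiting in its executor
cancel_flag.set()      # step 5: kill the waiting job
waiter.join(timeout=2)

# The waiter has exited, it never got the lock, and the running job still holds it.
print(waiter.is_alive(), result["acquired"], shared_lock.locked())  # False False True
```

          In this toy model the cancelled waiter returns promptly and holds nothing, so no restart would be needed to clear it.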

          Most of the time, it is possible to avoid step 5 above and wait for one job to finish, after which the other job actually starts running and eventually terminates correctly. From time to time, however, an imprudent user action leaves a job blocked, and we have to restart the master.

          Seems this issue is hard to describe, but not so rare. Any hope of having this fixed?

          Steven Schwell added a comment -

          We are also seeing a similar problem. When a job is canceled manually, it seems that it doesn't release the locks it is holding. Any new job that needs that lock just waits forever.
          Any ideas?
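
          For illustration only, the general pattern that avoids a leaked lock on cancellation is to release it on every exit path. The Python sketch below assumes cancellation surfaces as an exception inside the build. `JobCancelled`, `run_job` and `cancelled_body` are hypothetical names, and nothing here reflects how the locks-and-latches plugin is actually implemented.

```python
import threading


class JobCancelled(Exception):
    """Hypothetical signal raised inside a build when a user aborts it."""


def run_job(lock, body):
    """Run `body` while holding `lock`, releasing it on every exit path."""
    lock.acquire()
    try:
        body()
    finally:
        # Runs on normal completion, on errors, and on cancellation alike.
        lock.release()


def cancelled_body():
    # Stands in for a build that is aborted part-way through.
    raise JobCancelled("aborted by user")


shared_lock = threading.Lock()
try:
    run_job(shared_lock, cancelled_body)
except JobCancelled:
    pass

# The next job that needs the lock can take it immediately.
next_job_got_lock = shared_lock.acquire(blocking=False)
print(next_job_got_lock)  # True
```

          Without the `finally` clause, the exception would leave `shared_lock` held, and the final non-blocking `acquire` would return False, which mirrors the "waits forever" symptom described above.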

          Kyle Leinen added a comment -

          I am also seeing this. The job cannot be cleared/cancelled without restarting Jenkins completely, which is a pain.

          Mark Waite added a comment -

          The Hudson locks and latches plugin is not distributed by the Jenkins update center. It was proposed for deprecation in 2015 and has known blocking issues. It prevents the saving of Jenkins job configurations in Jenkins 2.277.1 and later.

          If someone adopted the plugin, updated it to work with Jenkins 2.277.1 and later, and released a new version, then this issue report could be reopened.

            Assignee: Unassigned
            Reporter: Robert Munteanu (rombert)
            Votes: 6
            Watchers: 6