We have hit several situations like this one over the past few weeks, all involving concurrent jobs that use the same lock. I think the problem may be mostly between chair and keyboard, but it is also due to a lack of informative messages.
Let's assume three jobs A, B and C use the same lock, and the following timeline occurs:
- A is already running (so it has the lock) and we start B and C.
- B and C show in queue stating they are being blocked by A
- A finishes
- B and C have already waited out the 'delay', so both are ready and start running at the same time
- B gets the lock, and its log states that it is actively running
- C, on the other hand, didn't get the lock, but it is running anyway and shows no output indicating that it is waiting for the lock
- Users panic when they see a job that has been running for 2 days without any output in the log, and they cancel C
- C ends up in an inconsistent state and starts showing "null" instead of real dates
- Jenkins needs to be restarted to get rid of C because there is no way to kill it
We also observed that if we don't cancel C, it eventually starts executing normally once B finishes, and the log then shows that it has acquired the lock.
Bottom line: the problem is the lack of informative messages from the plugin, which leaves users without the feedback they need to avoid panicking.
Workaround: revoke users' permission to cancel builds that use locks.
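To make it concrete, here is a rough sketch of the kind of feedback I have in mind. It is not the plugin's code (the plugin is Java; this is plain Python just to show the idea), and the names `acquire_with_progress`, `job_name`, `log` and `poll_interval` are made up for the example. The waiting job retries the lock with a timeout and writes a line to its log after every failed attempt, so a blocked build never looks silent.

```python
import threading
import time


def acquire_with_progress(lock, job_name, log, poll_interval=5.0):
    """Block until `lock` is acquired, reporting progress while waiting.

    `log` is any callable that takes one string (hypothetical stand-in
    for writing to the build log). `poll_interval` is the number of
    seconds between "still waiting" messages.
    """
    start = time.monotonic()
    # acquire(timeout=...) returns False if the lock was not obtained in time,
    # so each failed attempt is a chance to tell the user what is going on.
    while not lock.acquire(timeout=poll_interval):
        waited = time.monotonic() - start
        log(f"{job_name}: still waiting for lock ({waited:.1f}s elapsed)")
    log(f"{job_name}: acquired lock")
```

With something like this, job C in the timeline above would print a "still waiting" line every few seconds while B holds the lock, and users would have no reason to assume it is hung.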