Jenkins / JENKINS-37121

WorkspaceListLeasePickle should help diagnose locked workspaces

Details

    Description

      "Waiting to acquire /.../workspace/... : jenkins.util.Timer [#...]" id=... (0x...) state=WAITING cpu=75%
          - waiting on <0x...> (a hudson.slaves.WorkspaceList)
          - locked <0x...> (a hudson.slaves.WorkspaceList)
          at java.lang.Object.wait(Native Method)
          at java.lang.Object.wait(Object.java:502)
          at hudson.slaves.WorkspaceList.acquire(WorkspaceList.java:255)
          at hudson.slaves.WorkspaceList.acquire(WorkspaceList.java:234)
          at hudson.slaves.WorkspaceList.acquire(WorkspaceList.java:223)
          at org.jenkinsci.plugins.workflow.support.pickles.WorkspaceListLeasePickle$1.tryResolve(WorkspaceListLeasePickle.java:67)
          at org.jenkinsci.plugins.workflow.support.pickles.WorkspaceListLeasePickle$1.tryResolve(WorkspaceListLeasePickle.java:52)
          at org.jenkinsci.plugins.workflow.support.pickles.TryRepeatedly$1.run(TryRepeatedly.java:62)
          at ...
      

      Looks like a workspace did not get relocked fast enough to avoid getting grabbed by some other job?

      At a minimum, WorkspaceListLeasePickle should printWaitingMessage when acquire blocks, so it is clearer from the build log why the build is still stuck.

      Possibly it should fail if it cannot acquire the workspace immediately, since in this case the workspace can be assumed to have already been clobbered by something else. Currently there is no such core API.
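
The fail-fast idea above can be sketched in plain Java. This is only an illustration under stated assumptions: `SimpleWorkspaceList`, `tryAcquire`, and `FailFastResume` are hypothetical names invented for this example and are not the `hudson.slaves.WorkspaceList` API, which (as noted) has no such method. The blocking `acquire` mirrors the wait seen in the thread dump; `tryAcquire` returns at once instead.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a workspace lock table; NOT hudson.slaves.WorkspaceList. */
class SimpleWorkspaceList {
    private final Set<String> inUse = new HashSet<>();

    /** Blocking acquire: waits until the path is free, like the stuck thread in the dump. */
    synchronized void acquire(String path) throws InterruptedException {
        while (inUse.contains(path)) {
            wait();
        }
        inUse.add(path);
    }

    /** Non-blocking variant (assumed API): returns false at once if the path is taken. */
    synchronized boolean tryAcquire(String path) {
        return inUse.add(path);
    }

    /** Frees the path and wakes any thread blocked in acquire(). */
    synchronized void release(String path) {
        inUse.remove(path);
        notifyAll();
    }
}

class FailFastResume {
    /** Re-lock a workspace on resume, failing immediately rather than hanging. */
    static void relock(SimpleWorkspaceList list, String path) {
        if (!list.tryAcquire(path)) {
            throw new IllegalStateException("something already locked " + path);
        }
    }

    public static void main(String[] args) {
        SimpleWorkspaceList list = new SimpleWorkspaceList();
        relock(list, "/tmp/ws");          // first resume takes the lease
        try {
            relock(list, "/tmp/ws");      // second attempt fails at once
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The trade-off in this sketch is that a resumed build fails with a clear message instead of sitting silently in `wait()`; whether that is the right behaviour depends on whether the workspace can really be assumed clobbered.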

          Activity

            swapnilpatne Swapnil Patne added a comment - - edited

            I am getting this error today and, strangely, the build is not even listed in the build history in the left pane.

            Jenkins ver. 2.107.2

            java.lang.IllegalStateException: JENKINS-37121: something already locked /var/lib/jenkins/workspace/AutomationPipeline@12
             at org.jenkinsci.plugins.workflow.support.pickles.WorkspaceListLeasePickle$1.tryResolve(WorkspaceListLeasePickle.java:75)
             at org.jenkinsci.plugins.workflow.support.pickles.WorkspaceListLeasePickle$1.tryResolve(WorkspaceListLeasePickle.java:51)
             at org.jenkinsci.plugins.workflow.support.pickles.TryRepeatedly$1.run(TryRepeatedly.java:92)
             at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
             at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
             at java.util.concurrent.FutureTask.run(FutureTask.java:266)
             at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
             at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
             Caused: java.io.IOException: Failed to load build state
             at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:842)
             at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:840)
             at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:894)
             at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35)
             at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
             at java.util.concurrent.FutureTask.run(FutureTask.java:266)
             at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
             at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
             at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
             at java.util.concurrent.FutureTask.run(FutureTask.java:266)
             at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
             at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
             at java.lang.Thread.run(Thread.java:748)
             Finished: FAILURE

             

            akmjenkins ASHOK MOHANTY added a comment -

            We are still getting this error with Jenkins 2.190.3 and Kubernetes plugin v1.18.3:

             

            java.lang.IllegalStateException: JENKINS-37121: something already locked 
            

            I have updated the details in JENKINS-38994. Please let me know if I need to follow any other ticket(s).

             

            basil Basil Crow added a comment -

            For what it's worth, I get this every month or two:

            14:08:01  java.lang.IllegalStateException: JENKINS-37121: something already locked /path/to/job1
            14:08:01  	at org.jenkinsci.plugins.workflow.support.pickles.WorkspaceListLeasePickle$1.tryResolve(WorkspaceListLeasePickle.java:73)
            14:08:01  	at org.jenkinsci.plugins.workflow.support.pickles.WorkspaceListLeasePickle$1.tryResolve(WorkspaceListLeasePickle.java:50)
            14:08:01  	at org.jenkinsci.plugins.workflow.support.pickles.TryRepeatedly$1.run(TryRepeatedly.java:92)
            14:08:01  	at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67)
            14:08:01  	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            14:08:01  	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            14:08:01  	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
            14:08:01  	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
            14:08:01  Caused: java.io.IOException: Failed to load build state
            14:08:01  	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:865)
            14:08:01  	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:863)
            14:08:01  	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:917)
            14:08:01  	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:38)
            14:08:01  	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            14:08:01  	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            14:08:01  	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
            14:08:01  	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
            14:08:01  	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
            14:08:01  	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            14:08:01  	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            14:08:01  	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            14:08:01  	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            14:08:01  	at java.lang.Thread.run(Thread.java:748)
            14:08:01  Finished: FAILURE 
            

            The conditions are always the same. About half an hour prior, a regularly scheduled Job DSL "Generate Jobs" step starts:

            13:37:17  Processing DSL script jenkins/jobs/job1.groovy
            14:08:22  Processing DSL script jenkins/jobs/job2.groovy
            

            Note the timestamps: this "Generate Jobs" step is taking a very long time! job1 has thousands of builds in the history, and the storage is a slow NFS server. So this just takes forever. One time I caught a jstack of it and it was stuck in the I/O path in AbstractLazyLoadRunMap or similar.

            Once the DSL script for job1 is processed, we move on to job2 at 14:08, and that's when a few (though not all) runs of job1 blow up with the above stack trace.

            If there's anything else I can do to gather debug information, I'd be happy to.

            Yes I know the NFS server is really the issue, but fixing that is out of my control at the present time for organizational reasons.

            jglick Jesse Glick added a comment -

            No hypothesis offhand. Possibly needs a core patch to record the stack trace and other metadata of the original locker.
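
A rough sketch of what recording the original locker could look like, in plain Java. Everything here is an assumption for illustration: `TrackedWorkspaceList`, `Holder`, and `lockOrFail` are hypothetical names, not Jenkins core classes or methods. The idea is to capture a `Throwable` at lease time and attach it as the cause when a later attempt collides.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lock table that remembers who took each lease; not Jenkins core code. */
class TrackedWorkspaceList {

    /** Metadata captured at the moment a lease is taken. */
    static final class Holder {
        final String threadName;
        final long acquiredAtMillis;
        final Throwable origin; // carries the stack trace of the original locker

        Holder() {
            this.threadName = Thread.currentThread().getName();
            this.acquiredAtMillis = System.currentTimeMillis();
            this.origin = new Throwable("lease taken by thread " + threadName);
        }
    }

    private final Map<String, Holder> inUse = new HashMap<>();

    /** Takes the lease, or fails with the original locker's stack trace as the cause. */
    synchronized void lockOrFail(String path) {
        Holder existing = inUse.get(path);
        if (existing != null) {
            throw new IllegalStateException(
                    "something already locked " + path
                            + " (held by thread " + existing.threadName
                            + " since " + existing.acquiredAtMillis + ")",
                    existing.origin);
        }
        inUse.put(path, new Holder());
    }

    /** Drops the lease and its recorded metadata. */
    synchronized void release(String path) {
        inUse.remove(path);
    }
}
```

With something along these lines, the "Caused by" section of the failure would point at the code path that took the lease first, which is the information missing from the stack traces in this issue.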

            eradchenko Evgeny added a comment -

            We hit this issue after updating to Jenkins 2.289.1, after we generated a multibranch pipeline through the REST API.

            The workaround that fixed it for me:

            Edit /var/lib/jenkins/org.jenkins.plugins.lockableresources.LockableResourcesManager.xml

            Add <queuingStarted>0</queuingStarted>

            Restart Jenkins

             


            People

              jglick Jesse Glick