JENKINS-9675

Deleting a Job hangs and blocks all other Perforce activities

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • p4-plugin
    • None
    • Master and Slaves on Linux (RedHat), observed behavior on both Jenkins 1.3999 and 1.411, Perforce Client is Rev. P4/LINUX26X86_64/2010.1/265509 (2010/09/23), Perforce server is P4D/LINUX26X86_64/2010.1/265509 (2010/09/23)

      I'm not sure with what version of the Perforce plugin this started, but currently, using 1.2.5, when we try to delete a job using the UI, it sometimes completes normally, printing the following to the log:

      May 11, 2011 4:48:24 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: Workspace is being deleted; enabling one-time force sync.
      May 11, 2011 4:48:24 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: Using remote perforce client: hudson-cjo2011.03-initdb-76240347
      May 11, 2011 4:48:24 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: [cjo2011.03-initdb] $ p4 workspace -o hudson-cjo2011.03-initdb-76240347
      May 11, 2011 4:48:24 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: Changing P4 Client Root to: /home/cruise/hudson/workspace/cjo2011.03-initdb
      May 11, 2011 4:48:24 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: Saving modified client hudson-cjo2011.03-initdb-76240347
      May 11, 2011 4:48:24 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: [cjo2011.03-initdb] $ p4 -s client -i
      May 11, 2011 4:48:24 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: [cjo2011.03-initdb] $ p4 sync -k //hudson-cjo2011.03-initdb-76240347/...#0

      But more often it only gets this far:

      May 11, 2011 4:49:13 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: Workspace is being deleted; enabling one-time force sync.
      May 11, 2011 4:49:13 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: Using remote perforce client: hudson-cjo2011.03-webservices-e2e-76240347
      May 11, 2011 4:49:13 PM hudson.plugins.perforce.PerforceSCM processWorkspaceBeforeDeletion
      INFO: [cjo2011.03-webservices-e2e] $ p4 workspace -o hudson-cjo2011.03-webservices-e2e-76240347

      At this point the thread handling the request seems to be stuck at:

      Handling POST /view/bugfix/job/cjo2011.03-webservices-e2e/doDelete : RequestHandlerThread62
      java.lang.Object.wait(Native Method)
      hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:173)
      ...
      java.io.BufferedReader.readLine(BufferedReader.java:362)
      com.tek42.perforce.parse.AbstractPerforceTemplate.getPerforceResponse(AbstractPerforceTemplate.java:329)
      com.tek42.perforce.parse.AbstractPerforceTemplate.getPerforceResponse(AbstractPerforceTemplate.java:291)
      com.tek42.perforce.parse.Workspaces.getWorkspace(Workspaces.java:53)
      hudson.plugins.perforce.PerforceSCM.getPerforceWorkspace(PerforceSCM.java:1183)
      hudson.plugins.perforce.PerforceSCM.processWorkspaceBeforeDeletion(PerforceSCM.java:2202)

      I've attached the full stack trace from the 'monitoring' page (jenkinsDeleteThread_stackTrace.txt). When this happens, all perforce polling stops. If I kill this thread using the monitor page, perforce activities start up again, but the job that was to be deleted is still present.

      We've also tried deleting a job with the CLI client, and this also hangs, but if we kill the client, the job does get deleted. I've also attached the log output from when this hung client is killed (jenkinsKilledCLIClient_logOutput.txt).

      Let me know if there is any other information I can provide to help debug this.


          emmulator created issue -

          Rob Petti added a comment - - edited
          1. Does it always hang on the same perforce command, or does it hang on different ones occasionally?
          2. When it hangs, what is the output of
            p4 monitor show -ale
          3. What are the java processes doing CPU- and memory-wise during a hang? Are they using more and more memory, or 100% CPU? How about the p4 process?
          4. If you run
            p4 workspace -o hudson-cjo2011.03-webservices-e2e-76240347

            from the slave, does it hang as well? If not, how many lines are returned?

          5. Are your slave agents up to date?
          6. Do you ever experience hangs like this during builds and/or polling normally?
          7. And just to clarify, you are trying to delete the job completely, correct? Not just wipe out the workspace?

          Rob Petti made changes -
          Assignee New: Rob Petti [ rpetti ]

          emmulator added a comment -

          1. Every time I've seen it hang, it was at the same place.
          2-3. Of course, today I can't replicate it, but as soon as I do, I will answer these points. We pull a branch every month, and the Jobs for those monthly branches only need to live for two months before we get rid of them. I created a bunch of new Jobs to delete today, and they all worked. So maybe there's something special about the Jobs for the monthly branches that wasn't replicated by the test Jobs I made today. I will try again next week and let you know as soon as I can replicate the issue again.
          4. I did run the workspace command on the server when we were seeing the hang, and it output the workspace description as normal.
          5. Yes, they all have the slave.jar from the same version running on the master.
          6. We've seen 'stuck polling threads' before on the 'Manage Jenkins' page, but neither they nor anything else we've seen blocks all perforce activity in this manner.
          7. Yes, we've only seen this when trying to delete the Job, not when just cleaning the workspace.

          Rob Petti added a comment -

          Do all perforce operations hang, or just those from Jenkins?

          Also, do you have any remote depots on your perforce server?

          I'm sort of at a loss as to why this would happen, so I admit that I'm grasping at straws here. It may be time to implement a global timeout/retry/fail on all perforce operations performed from the plugin.
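
The global timeout/retry/fail idea mentioned above could look roughly like the following. This is only a sketch and not the plugin's code: `TimedPerforceCall` and `callWithTimeout` are hypothetical names, and the operation is assumed to be any blocking `Callable`, for example a wrapper around the code that reads the p4 response.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not part of the Perforce plugin.
final class TimedPerforceCall {
    private TimedPerforceCall() {}

    /**
     * Runs a blocking operation on a worker thread. Each attempt is abandoned
     * after timeoutMillis; after maxAttempts timed-out attempts the call fails
     * with a TimeoutException instead of blocking the caller forever.
     */
    static <T> T callWithTimeout(Callable<T> operation, long timeoutMillis, int maxAttempts)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        // Daemon threads, so an operation that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newCachedThreadPool(runnable -> {
            Thread worker = new Thread(runnable, "p4-timed-call");
            worker.setDaemon(true);
            return worker;
        });
        try {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(operation);
                try {
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    // Interrupt the stuck worker, then retry if attempts remain.
                    future.cancel(true);
                } catch (ExecutionException e) {
                    // The operation itself failed: rethrow its own exception.
                    Throwable cause = e.getCause();
                    if (cause instanceof Exception) {
                        throw (Exception) cause;
                    }
                    throw e;
                }
            }
            throw new TimeoutException("operation timed out after " + maxAttempts + " attempt(s)");
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note that `cancel(true)` only interrupts the worker thread. The caller is unblocked either way, but the underlying read stops only if it responds to interruption.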

          Rob Petti added a comment -

          Is this still occurring for you with the latest p4d updates?

          Keith Richardson added a comment - - edited

          Hi,

          I have been facing this problem for some time. It happens occasionally; I've not been able to reproduce it reliably either.

          Can the perforce plugin implement the global timeout you suggested earlier?

          I am not related to emmulator nor do I know if he still faces this problem.

          FWIW - I have only seen this happen during complete job deletion. Other threads block while trying to acquire a lock already held by the problematic thread (job deletion thread).
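
One way to confirm which threads are blocked, and on whose lock, is to query the JVM's thread management bean. The sketch below is an illustration only: `BlockedThreadReport` is a hypothetical class name, and it uses nothing beyond the standard `java.lang.management` API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper for illustration.
final class BlockedThreadReport {
    private BlockedThreadReport() {}

    /** Returns one line per thread that is BLOCKED waiting to enter a monitor. */
    static List<String> blockedThreads() {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        for (ThreadInfo info : threadBean.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                lines.add(info.getThreadName()
                        + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return lines;
    }
}
```

If the job deletion thread is the one holding the lock, its name should appear as the owner on each reported line.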

          PS. For others who are facing this problem, you can use our workaround, which does not require a restart. We interrupt the blocked thread manually using a groovy script. I'll attach it to the defect.
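
For readers without access to the attachment, the general idea of interrupting a thread by name can be sketched as follows. This is not the attached script: it is a Java illustration with hypothetical names (`ThreadInterrupter`, `interruptByName`), and it assumes the stuck thread can be identified by a fragment of its name, such as the job's `doDelete` request path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the attached groovy script.
final class ThreadInterrupter {
    private ThreadInterrupter() {}

    /**
     * Interrupts every live thread (other than the caller) whose name contains
     * nameFragment, and returns the names of the threads that were interrupted.
     */
    static List<String> interruptByName(String nameFragment) {
        List<String> interrupted = new ArrayList<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (thread != Thread.currentThread() && thread.getName().contains(nameFragment)) {
                thread.interrupt();
                interrupted.add(thread.getName());
            }
        }
        return interrupted;
    }
}
```

An interrupt only helps when the thread is blocked in an interruptible call. In the stack trace reported in this issue the thread is parked in `Object.wait`, which does respond to interruption.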

          Tomcat 6.0.29

          Hudson 1.395
          Slave type - SSH Slave 0.14
          Perforce plugin 1.2.2
          P4 Server - 2010.1/251161
          P4 Client - Rev. P4/LINUX26X86/2007.3/143793 (2008/01/09).

          Full stack trace is attached as kr1chard.stack.trace

          Keith Richardson made changes -
          Attachment New: interruptThreadByName.groovy [ 20663 ]
          Attachment New: kr1chard.stack.trace [ 20664 ]

          Rob Petti added a comment -

          Does this block ALL perforce operations, or just the ones from Jenkins? In other words, does the perforce server stop responding, or continue to work normally?

          Rob Petti added a comment -

          Also, your perforce plugin version is quite out of date. Would you mind trying the latest?

            Assignee: Unassigned
            Reporter: emmulator
            Votes: 1
            Watchers: 6
