
WorkspaceCleanupThread may delete workspaces of running jobs

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component/s: core
    • Environment: Linux host; Linux, OSX, and Windows slaves. Jenkins version 1.602.

      The problem is as described in JENKINS-4501. As requested in JENKINS-4501, I am creating a new issue as this problem still exists in 1.602.

      In short, Jenkins silently and erroneously deletes workspaces on slaves for matrix projects that are not old.

      Over the time I've worked with Jenkins, this behavior has cost me literally days of work and waiting on very long-running builds that rely on cached workspaces to be manageable. It cost me more hours again today, after I restored Jenkins to a new server following a hardware failure. This setting was reset because it lives outside the normally recommended backup files, and I didn't think to add it when I "fixed" this last time.

      Would it not be easier to have hudson.model.WorkspaceCleanupThread.disabled default to true? Having the default behavior be "destroy my data" seems bad, especially with how cheap disk is now. I'm sure when this option was implemented it made a lot of sense, but when I can get a 1 TB disk for $50, it just seems wrong-headed. Let the fallow workspaces lie. I can clean them up if I need to.

      If that's not an acceptable solution, could it not be moved to a config location in the Jenkins home? That way we can be relatively sure that the setting will be propagated in backups and will not bite someone who thought they had solved this problem and had forgotten about it.

          [JENKINS-27329] WorkspaceCleanupThread may delete workspaces of running jobs

          Quentin Hartman created issue -
          Daniel Beck made changes -
          Component/s New: matrix-project-plugin [ 18765 ]

          Matthew Webber added a comment -

          >> Jenkins silently and erroneously deletes workspaces on slaves for matrix projects that are not old
          We just hit this problem (or what appears to be this problem) as well.

          An extract from $JENKINS_HOME/Workspace clean-up.log:

          Deleting /Users/dlshudson/jenkins_slave/workspace/dials_distribute on dials-mac-mini
          Deleting /scratch/jenkins_slave/workspace/dials_distribute on dials-ws133
          Deleting /scratch/jenkins_slave/workspace/dials_distribute on dials-ws154

          Side note: it's a shame those log lines are not time-stamped.
          The 3 mentioned workspaces are from a matrix project, and all workspaces had been accessed recently.

          Presumably there is a bug in the workspace cleanup code that means it does not handle matrix projects correctly.

          Note that the job configuration specifies 4 slaves: 1 by label, and 3 by individual nodes. The workspaces that were deleted were those on the 3 slaves that were specified as individual nodes, but the workspace on the slave that was specified by label was not deleted. Possibly a clue to the bug?

          The workaround is to set hudson.model.WorkspaceCleanupThread.disabled=true.

          Matthew Webber added a comment -

          Daniel knows about this area, so assigning to him for comment (sorry, Daniel!)
          Matthew Webber made changes -
          Assignee New: Daniel Beck [ danielbeck ]
          Daniel Beck made changes -
          Link New: This issue is duplicated by JENKINS-30916 [ JENKINS-30916 ]

          Ingo Weinhold added a comment -

          Since JENKINS-30916 has been closed as a duplicate: the description of this ticket only says that workspaces that aren't old are deleted. In fact, a workspace can even be deleted while a build using it is in progress. The lines from the system log for such a case:

          Okt 13, 2015 3:29:27 AM INFORMATION hudson.slaves.CommandLauncher launch
          slave agent launched for BonefishMac-Ubuntu-12.04
          Okt 13, 2015 3:31:15 AM INFORMATION hudson.model.AsyncPeriodicWork$1 run
          Started Workspace clean-up
          Okt 13, 2015 3:31:21 AM INFORMATION hudson.model.Run execute
          Bar-Nightly/label=Ubuntu-12.04 #222 main build action completed: FAILURE
          


          Daniel Beck added a comment -

          bonefish: Same reason. Workspace cleanup uses the modification date of the root workspace directory to determine whether the workspace is old. As matrix jobs only build in subdirectories (corresponding to axes), it's easy for the root directory to appear unmodified for a long time.
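
          The effect described here can be reproduced outside Jenkins. The following is a minimal Python sketch, not Jenkins code: the function names, the fake axis directory, and the 30-day threshold are all assumptions made for illustration. It shows that writing into a subdirectory leaves the root directory's modification time untouched, so a root-only check reports the workspace as stale while a recursive check does not.

```python
import os
import tempfile
import time

DAY = 24 * 60 * 60


def newest_mtime(root):
    """Latest modification time of `root` or of any entry beneath it."""
    newest = os.path.getmtime(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                entry_mtime = os.lstat(os.path.join(dirpath, name)).st_mtime
            except OSError:
                continue  # entry disappeared while we were walking
            newest = max(newest, entry_mtime)
    return newest


def looks_stale(root, max_age_days, recursive=False, now=None):
    """True if the workspace appears untouched for more than `max_age_days`."""
    now = time.time() if now is None else now
    mtime = newest_mtime(root) if recursive else os.path.getmtime(root)
    return now - mtime > max_age_days * DAY


def demo():
    """Build a fake matrix workspace and compare the two checks."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical axis subdirectory, as a matrix job would create.
        axis_dir = os.path.join(root, "label=Ubuntu-12.04")
        os.mkdir(axis_dir)
        # Pretend the root directory was last modified 60 days ago.
        old = time.time() - 60 * DAY
        os.utime(root, (old, old))
        # A build writes into the axis subdirectory; this updates the
        # subdirectory's mtime but not the root's.
        with open(os.path.join(axis_dir, "build.log"), "w") as f:
            f.write("building\n")
        # 30 days is an assumed retention threshold for this sketch.
        return looks_stale(root, 30), looks_stale(root, 30, recursive=True)
```

          Calling demo() returns one result for the root-only check and one for the recursive check. The recursive walk is more expensive on large workspaces, which may be one reason a cleanup task would prefer the cheaper root-only check.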

          R. Tyler Croy made changes -
          Workflow Original: JNJira [ 161534 ] New: JNJira + In-Review [ 180734 ]
          Daniel Beck made changes -
          Assignee Original: Daniel Beck [ danielbeck ]

            Assignee: Unassigned
            Reporter: Quentin Hartman (qhartman)
            Votes: 13
            Watchers: 20