
Run.delete (from LogRotator) failing with "...looks to have already been deleted"

      Sometimes "log rotation" fails with an exception like

      SEVERE  hudson.model.Run#execute: Failed to rotate log
      java.io.IOException: .../jobs/.../modules/...$.../builds/2013-... looks to have already been deleted
          at hudson.model.Run.delete(Run.java:1432)
          at hudson.maven.MavenModuleSetBuild.delete(MavenModuleSetBuild.java:420)
          at hudson.tasks.LogRotator.perform(LogRotator.java:136)
          at hudson.model.Job.logRotate(Job.java:437)
          at hudson.maven.MavenModuleSet.logRotate(MavenModuleSet.java:851)
          at hudson.model.Run.execute(Run.java:1728)
          at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:509)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:246)
      

      Usually MavenModuleSetBuild is involved, which may suggest a problem with deleted or skipped module builds (though I cannot reproduce the error in either scenario), and I think I have also seen this happen for freestyle builds. It is unclear whether the directory actually exists but File.isDirectory reports that it does not (the code originally tried to delete the directory without checking this and failed with "... is in use"), or whether the directory was actually deleted earlier but this Run was not cleaned up properly for some reason.
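
One way to tell those two cases apart would be to probe the build directory with an API that reports why the check failed, since File.isDirectory returns false both when the path is missing and when it cannot be read. The following is only a sketch under assumptions: it relies on java.nio.file (Java 7 or later), and DirProbe and describe are hypothetical names, not part of Jenkins or of the diagnostics mentioned in this issue.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical diagnostic helper; not part of Jenkins. */
class DirProbe {
    /** Describes the state of a path, separating "missing" from other I/O failures. */
    static String describe(Path dir) {
        try {
            BasicFileAttributes attrs = Files.readAttributes(dir, BasicFileAttributes.class);
            return attrs.isDirectory() ? "directory" : "not a directory";
        } catch (NoSuchFileException e) {
            // The path really is gone, e.g. deleted by an earlier rotation.
            return "missing";
        } catch (IOException e) {
            // The path may exist but could not be inspected (permissions, I/O error, ...).
            return "unreadable: " + e;
        }
    }
}
```

Logging the result of such a probe next to the "looks to have already been deleted" message would show whether the directory was genuinely absent at that moment.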

      Hypothesis: AbstractLazyLoadRunMap.idOnDisk is not removed by removeValue, called from AbstractProject.removeRun (from Run.delete). Perhaps something is later resurrecting the AbstractBuild from idOnDisk, making it again available for another round of log rotation? But if the actual directory was deleted, it is hard to see how: load would just fail.

      Or perhaps Run.delete itself has a race condition: the run is removed from its parent only after its directory has been deleted. The method is synchronized, but that does not help if two copies of the Run exist, which might happen due to other lazy-loading bugs.
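
The point about synchronization can be illustrated with a simplified model. This is a sketch, not the Jenkins code: FakeRun is a hypothetical stand-in for hudson.model.Run, and its delete method only mimics the check-then-delete behavior described above. Because a synchronized instance method locks only its own object, two objects that point at the same build directory do not exclude each other, and the second deletion fails.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical stand-in for hudson.model.Run; not the real Jenkins class. */
class FakeRun {
    private final File rootDir;

    FakeRun(File rootDir) {
        this.rootDir = rootDir;
    }

    /** Locks only this instance, so a second copy of the same build is not excluded. */
    synchronized void delete() throws IOException {
        if (!rootDir.isDirectory()) {
            throw new IOException(rootDir + " looks to have already been deleted");
        }
        if (!rootDir.delete()) {
            throw new IOException("could not delete " + rootDir);
        }
    }
}

class DuplicateRunDemo {
    /** Returns the message of the failure raised by the second copy, or null if none. */
    static String deleteTwice() throws IOException {
        File dir = Files.createTempDirectory("build").toFile();
        FakeRun first = new FakeRun(dir);
        FakeRun second = new FakeRun(dir); // assumption: a duplicate created by lazy loading
        first.delete();
        try {
            second.delete();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(deleteTwice());
    }
}
```

The demo is sequential, so it does not need a real thread race to fail: having two copies of the same build is enough for the second delete to report that the directory is already gone.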

      Diagnostics added to date:

          [JENKINS-22395] Run.delete (from LogRotator) failing with "...looks to have already been deleted"

          Jesse Glick created issue -

          Jesse Glick added a comment -

          Not to be confused with JENKINS-19377, which had a similar stack trace but was easily reproducible and limited to external monitor jobs.

          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-19377 [ JENKINS-19377 ]
          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-17508 [ JENKINS-17508 ]
          Jesse Glick made changes -
          Link New: This issue is related to JENKINS-17553 [ JENKINS-17553 ]

          Jesse Glick added a comment - Diagnostic build: http://repo.jenkins-ci.org/public/org/jenkins-ci/main/jenkins-war/1.532.2.JENKINS-22395-diag/jenkins-war-1.532.2.JENKINS-22395-diag.war

          Jesse Glick added a comment -

          Enhanced logging suggests that when MavenModuleSetBuild.delete leads to this error, it is deleting a MavenModuleBuild with the same number (successfully), and then (via getModuleBuilds) one with a later number that had already been deleted during an earlier MavenModuleSetBuild.delete. In other words, AbstractBuild.getNextBuild is returning a build which was already (recently) deleted from its parent! That suggests that dropLinks failed to actually drop the link from the older build, perhaps because the referent of previousBuildR in the newer build was null, or perhaps because the older build was collected and then recreated.
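
The first of those two explanations can be modeled in a few lines. This is a hypothetical, heavily simplified sketch: FakeBuild and its fields are illustrative names, a plain nullable field stands in for the referent of previousBuildR, and the real dropLinks may well differ. It shows only how a lost back reference would leave the older build pointing at a deleted successor.

```java
/** Hypothetical, simplified model of linked builds; not the Jenkins implementation. */
class FakeBuild {
    final int number;
    FakeBuild previous; // stands in for the referent of previousBuildR; null if it was cleared
    FakeBuild next;
    boolean deleted;

    FakeBuild(int number) {
        this.number = number;
    }

    /** Unlinks this build from its neighbours, but only from those it can still reach. */
    void dropLinks() {
        if (next != null) {
            next.previous = previous;
        }
        if (previous != null) {
            previous.next = next; // skipped when the back reference has been lost
        }
    }

    void delete() {
        if (deleted) {
            throw new IllegalStateException("#" + number + " looks to have already been deleted");
        }
        deleted = true;
        dropLinks();
    }
}

class StaleLinkDemo {
    /** Returns true if the older build still links to the deleted newer build. */
    static boolean nextIsStale() {
        FakeBuild older = new FakeBuild(1);
        FakeBuild newer = new FakeBuild(2);
        older.next = newer;
        newer.previous = null; // assumption: the back reference was cleared before deletion
        newer.delete();
        return older.next == newer && older.next.deleted;
    }
}
```

In this model, anything that later walks forward from the older build reaches the deleted one again, and a second delete on it fails in the same way as the reported error.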

          Jesse Glick made changes -
          Status Original: Open [ 1 ] New: In Progress [ 3 ]
          Jesse Glick made changes -
          Labels Original: exception log-rotator New: exception lazy-loading log-rotator

            jglick Jesse Glick
            jglick Jesse Glick