
jenkins random hang during startup - Solaris and Linux

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Component: core

      Hi,

      We are having intermittent issues starting Jenkins on our Solaris machine. In about 3 out of 5 restarts, Jenkins does not get past the 'INFO Augmented all extensions' line and just hangs. I have attached 2 different jstack dumps in the hope that they help troubleshoot what is going on. It looks as if hudson.Util.resolveSymlink is involved, but I have no idea what to look for beyond that. If needed I can truss the process, but that produces a lot of output, so you'll need to tell me in more detail what to look for.

      jmap -heap does not seem to indicate a problem:

      Attaching to process ID 20951, please wait...
      Debugger attached successfully.
      Server compiler detected.
      JVM version is 20.2-b06
      
      using thread-local object allocation.
      Parallel GC with 43 thread(s)
      
      Heap Configuration:
         MinHeapFreeRatio = 40
         MaxHeapFreeRatio = 70
         MaxHeapSize      = 2147483648 (2048.0MB)
         NewSize          = 1048576 (1.0MB)
         MaxNewSize       = 4294901760 (4095.9375MB)
         OldSize          = 4194304 (4.0MB)
         NewRatio         = 2
         SurvivorRatio    = 8
         PermSize         = 134217728 (128.0MB)
         MaxPermSize      = 268435456 (256.0MB)
      
      Heap Usage:
      PS Young Generation
      Eden Space:
         capacity = 537919488 (513.0MB)
         used     = 270585256 (258.05020904541016MB)
         free     = 267334232 (254.94979095458984MB)
         50.30218499910529% used
      From Space:
         capacity = 89653248 (85.5MB)
         used     = 51440608 (49.057586669921875MB)
         free     = 38212640 (36.442413330078125MB)
         57.37729435078582% used
      To Space:
         capacity = 89653248 (85.5MB)
         used     = 0 (0.0MB)
         free     = 89653248 (85.5MB)
         0.0% used
      PS Old Generation
         capacity = 1434451968 (1368.0MB)
         used     = 0 (0.0MB)
         free     = 1434451968 (1368.0MB)
         0.0% used
      PS Perm Generation
         capacity = 134217728 (128.0MB)
         used     = 94797032 (90.4054946899414MB)
         free     = 39420696 (37.594505310058594MB)
         70.62929272651672% used
      

          [JENKINS-19244] jenkins random hang during startup - Solaris and Linux

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Mirko Friedenhagen
          Path:
          src/test/java/hudson/plugins/jobConfigHistory/AbstractHudsonTestCaseDeletingInstanceDir.java
          http://jenkins-ci.org/commit/jobConfigHistory-plugin/ae6f95f768644539649aed7bf77cfe04ec19deb5
          Log:
          Possible workaround for https://issues.jenkins-ci.org/browse/JENKINS-19244
          Jenkins random hang during startup - Solaris and Linux

          Geoff Cummings added a comment -

          Might not be related, but...

          Our Jenkins server used to take 15 minutes to load jobs on startup after "INFO: Augmented all extensions".
          Last week it failed to start at all.

          I read that the Downstream build view plugin causes Jenkins to load all jobs at startup, so I disabled it.

          I then found that the Jenkins Javascript Widgets Plugin seems to load all jobs and builds on startup, so I disabled it too.

          Jenkins restart is a lot quicker now!

          Mirko Friedenhagen added a comment -

          Just for the record: -Dhudson.model.parallelLoad=false fixed this for the integration tests of jobConfigHistory-plugin.

          Jesse Glick added a comment -

          @mfriedenhagen your deadlock is reported as JENKINS-20988; not related I think.


          XiVO DeV TEAM added a comment -

          We had the same problem: Jenkins randomly (but very often) hangs at startup during the loading of jobs.
          Adding the option -Dhudson.model.parallelLoad=false did not solve the problem.
          Looking at the jtrace, we saw one thread blocking on the plugin "Throttle Concurrent Builds Plugin".
          We fixed the problem by downgrading the plugin "Throttle Concurrent Builds Plugin" from 1.8.1 to 1.8.


          Sylvain Mougenot added a comment -

          Same problem, but Jenkins hangs every time.
          Jenkins: 1.553, 1.552
          throttle-concurrents: 1.8.1
          #jobs: ~300
          JDK: 1.7 or 1.6

          Thanks for all the comments, they helped a lot.

          I followed the advice given here:

          • disable throttle-concurrents (touch jenkins/plugins/throttle-concurrents.jpi.disabled)
          • downgrade Jenkins to the latest version (1.549) that I ran successfully (thanks to the backups in /usr/lib/jenkins)
          • move the "jobs" directory so that Jenkins starts up without jobs (mv /usr/lib/jenkins/jobs /usr/lib/jenkins/jobs_)
          • move the "jobs" directory back after Jenkins is up and running (mv /usr/lib/jenkins/jobs_ /usr/lib/jenkins/jobs)
          • go to admin and trigger the "reload config" action
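The file-system part of the workaround in this comment can be sketched as a few small shell functions. This is only an illustration under stated assumptions: the function names and the JENKINS_HOME variable are hypothetical, the default path is the one quoted in the comment, and the guard in restore_jobs covers the case where an empty "jobs" directory has been recreated while the real one was moved aside (otherwise mv would nest jobs_ inside it).

```shell
#!/bin/sh
# Hypothetical helper functions; JENKINS_HOME is an assumption and defaults
# to the directory quoted in the comment.
JENKINS_HOME="${JENKINS_HOME:-/usr/lib/jenkins}"

# Disable a plugin by creating its <name>.jpi.disabled marker file.
disable_plugin() {
    touch "$JENKINS_HOME/plugins/$1.jpi.disabled"
}

# Move the jobs directory aside so Jenkins starts without loading any jobs.
hide_jobs() {
    mv "$JENKINS_HOME/jobs" "$JENKINS_HOME/jobs_"
}

# Move the jobs directory back once Jenkins is up and running.
restore_jobs() {
    # If an empty "jobs" directory exists, remove it first.
    # rmdir refuses to delete a non-empty directory, so nothing is lost.
    if [ -d "$JENKINS_HOME/jobs" ]; then
        rmdir "$JENKINS_HOME/jobs" || return 1
    fi
    mv "$JENKINS_HOME/jobs_" "$JENKINS_HOME/jobs"
}
```

A possible sequence, with Jenkins stopped for the first two calls: `disable_plugin throttle-concurrents`, `hide_jobs`, start Jenkins, `restore_jobs`, then trigger the "reload config" action from the admin pages by hand. Downgrading Jenkins itself is not covered by this sketch.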

          Ben Ketteridge added a comment -

          We have the same problem on 1.554. We don't use throttle-concurrents.
          Moving the jobs to a backup directory and back after start-up works.
          Adding -Dhudson.model.Hudson.parallelLoad=false did not.
          Our configuration has 80 Maven builds, none of which are matrix jobs.

          Michael Whidden added a comment -

          Having the same issue. Jenkins startup on Solaris hangs after 'Augmented all extensions'. Removing all jobs (103 jobs) works. I tried adding the jobs back one at a time, and it only seems to be an issue when there are more than 5 or 6 jobs.

          -Dhudson.model.Hudson.parallelLoad=false did solve the problem, so it seems to be an issue with the parallel load when there are more than a small number of jobs.
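For readers unsure where the option goes: it is a JVM system property, so it has to appear before -jar on the java command line. A minimal sketch, assuming Jenkins is started directly from a jenkins.war in the current directory (the file location is an assumption; installations that use a service wrapper set Java options in the wrapper's own configuration instead). The property spelling is the one quoted in this comment; an earlier comment in the thread spells it -Dhudson.model.parallelLoad=false, and reports on whether the option helps are mixed.

```shell
# Start Jenkins with parallel job loading turned off.
# The property name is copied from the comment above; jenkins.war is an assumed path.
java -Dhudson.model.Hudson.parallelLoad=false -jar jenkins.war
```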

          Daniel Beck added a comment -

          Is this still an issue on recent Jenkins versions?

          If so, please post comments with updated information. Be specific, and include at least the following:

          • Which Solaris version
          • Which JRE in which version
          • Which Jenkins version, which plugins in which versions
          • If you tried removing plugins one by one, which did you find to be the culprit?
          • Please post your thread dumps when it hangs. (https://wiki.jenkins-ci.org/display/JENKINS/Obtaining+a+thread+dump)
          • Try all of the workarounds posted in the comments (so far: uninstall/downgrade Throttle Concurrent Builds, -Dhudson.model.parallelLoad=false) and report their results
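One way to collect the requested thread dumps is to run the JDK's jstack against the hung process several times, a few seconds apart, so that threads that never move stand out. The function below is a sketch, not an official procedure: its name, its arguments, and the JSTACK override variable are all hypothetical; only `jstack -l <pid>` is the real tool invocation.

```shell
#!/bin/sh
# Take COUNT thread dumps of JVM process PID, INTERVAL seconds apart,
# writing them to OUTDIR as jstack-<pid>-<n>.txt.
# JSTACK is a hypothetical override for the jstack binary to use.
take_thread_dumps() {
    pid=$1
    count=${2:-3}
    interval=${3:-10}
    outdir=${4:-.}
    i=1
    while [ "$i" -le "$count" ]; do
        # -l adds lock information, which helps when looking for deadlocks.
        "${JSTACK:-jstack}" -l "$pid" > "$outdir/jstack-$pid-$i.txt"
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For example, `take_thread_dumps 20951 3 10 /tmp` would write three dumps of process 20951 to /tmp, ten seconds apart. It has to be run as the same user that owns the Jenkins process.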


          Daniel Beck added a comment -

          No response to comment asking for updated and additional information in over a month, so resolving as Cannot Reproduce.

          Please file a new issue when something like this happens to you on recent Jenkins versions, and link back to this one as possibly related. Make sure to follow the advice on: https://wiki.jenkins-ci.org/display/JENKINS/How+to+report+an+issue


            Assignee: Unassigned
            Reporter: Jorg Heymans (heymjo)
            Votes: 8
            Watchers: 16