Jenkins / JENKINS-16879

More robust display detection needed - builds fail when many builds require Xvnc

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: xvnc-plugin
    • Labels: None
    • Environment: Ubuntu 12.04, Jenkins 1.500, XVNC plugin 1.10

    Description

      We're having issues with failing builds. We run several builds in parallel on the same Jenkins machine, many of them use XVNC, and I assume the failures are related to this.

      Builds fail several times a day with the following error (though displays might differ, obviously):

      Starting xvnc
      [my-build] $ /usr/bin/vncserver :37 -geometry 1920x1280
      A VNC server is already running as :37
      Starting xvnc
      [my-build] $ /usr/bin/vncserver :49 -geometry 1920x1280
      A VNC server is already running as :49
      Starting xvnc
      [my-build] $ /usr/bin/vncserver :50 -geometry 1920x1280
      A VNC server is already running as :50
      Starting xvnc
      [my-build] $ /usr/bin/vncserver :51 -geometry 1920x1280
      A VNC server is already running as :51
      FATAL: Failed to run '/usr/bin/vncserver :51 -geometry 1920x1280' (exit code 98), blacklisting display #51; consider checking the "Clean up before start" option
      java.io.IOException: Failed to run '/usr/bin/vncserver :51 -geometry 1920x1280' (exit code 98), blacklisting display #51; consider checking the "Clean up before start" option
      	at hudson.plugins.xvnc.Xvnc.doSetUp(Xvnc.java:100)
      	at hudson.plugins.xvnc.Xvnc.doSetUp(Xvnc.java:98)
      	at hudson.plugins.xvnc.Xvnc.doSetUp(Xvnc.java:98)
      	at hudson.plugins.xvnc.Xvnc.doSetUp(Xvnc.java:98)
      	at hudson.plugins.xvnc.Xvnc.setUp(Xvnc.java:73)
      	at hudson.model.Build$BuildExecution.doRun(Build.java:154)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:592)
      	at hudson.model.Run.execute(Run.java:1557)
      	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      	at hudson.model.ResourceController.execute(ResourceController.java:88)
      	at hudson.model.Executor.run(Executor.java:236)
      

      To me it seems that displays which are in use get blacklisted. This is undesirable, since we currently don't have any stale locks in /tmp/.X11-unix/ (where our VNC locks are placed).

      Could locks in /tmp/.X*-lock and /tmp/.X11-unix/X* be considered when trying to start a new display? Could we allow more retries than the current 3? Or do you have any other ideas on how to address this issue?
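As an illustration of the lock-based check asked about above, here is a minimal sketch in Python. It is not the plugin's code: the function names `display_in_use` and `first_free_display` are hypothetical, and the only assumption taken from the report is that an X lock file lives at /tmp/.X<n>-lock and a socket at /tmp/.X11-unix/X<n>.

```python
import os


def display_in_use(display, tmp_dir="/tmp"):
    """Return True if a lock file or socket exists for this display number."""
    lock_file = os.path.join(tmp_dir, ".X%d-lock" % display)
    socket_path = os.path.join(tmp_dir, ".X11-unix", "X%d" % display)
    return os.path.exists(lock_file) or os.path.exists(socket_path)


def first_free_display(candidates, tmp_dir="/tmp"):
    """Return the first candidate with no lock file or socket, or None."""
    for display in candidates:
        if not display_in_use(display, tmp_dir):
            return display
    return None
```

A check like this is only advisory: another process can claim the display between the check and the vncserver start, so a failed start still has to be handled.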

        Activity

          davidparsson David Pärsson added a comment - If necessary I could contribute a patch, but if so I'd appreciate it if you could point me in a good direction.
          jglick Jesse Glick added a comment -

          Not sure what the root cause is. The plugin maintains a list of free display numbers so it should not be attempting to reuse one unless that build is done. Perhaps the vncserver -kill at the end is failing?
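To make the bookkeeping described in this comment concrete, here is a rough sketch of a free-list allocator in Python. The class `DisplayAllocator` and its methods are hypothetical and are not taken from the plugin source. The sketch only shows the idea that a display number returns to the free pool when its build releases it, and that a blacklisted number never returns.

```python
import threading


class DisplayAllocator:
    """Hypothetical allocator sketch; not the xvnc-plugin's actual class."""

    def __init__(self, min_display, max_display):
        self._lock = threading.Lock()
        self._free = set(range(min_display, max_display + 1))
        self._in_use = set()
        self._blacklisted = set()

    def allocate(self):
        """Hand out the lowest free display number."""
        with self._lock:
            if not self._free:
                raise RuntimeError("no free display numbers left")
            display = min(self._free)
            self._free.remove(display)
            self._in_use.add(display)
            return display

    def release(self, display):
        """Return a display to the free pool once its build is done."""
        with self._lock:
            if display in self._in_use:
                self._in_use.remove(display)
                self._free.add(display)

    def blacklist(self, display):
        """Permanently retire a display that failed to start."""
        with self._lock:
            self._in_use.discard(display)
            self._free.discard(display)
            self._blacklisted.add(display)
```

With this design, a display that is blacklisted while merely busy is lost for good, which would match the behaviour the reporter describes.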

          davidparsson David Pärsson added a comment - edited

          I think the actual cause in this case was that we ran out of TCP ports in VNC's port range because of a misconfigured Jenkins machine under heavy load, but I believe I had also seen this a few times before that bad configuration was applied.

          Is starting a VNC server so expensive that we can't afford to try more than three times? And why are displays never reused?
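For anyone debugging a similar port shortage, a small Python sketch can show whether the TCP port behind a display number is already taken. It assumes the usual VNC convention that display :N listens on port 5900 + N. The helper names `vnc_port_for_display` and `port_is_free` are made up for illustration.

```python
import socket

# Assumption: conventional VNC base port; display :N listens on 5900 + N.
VNC_BASE_PORT = 5900


def vnc_port_for_display(display):
    """Map an X display number to its conventional VNC TCP port."""
    return VNC_BASE_PORT + display


def port_is_free(port, host="0.0.0.0"):
    """Try to bind the port; a failed bind means something already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

For example, `port_is_free(vnc_port_for_display(51))` probes the port that display :51 would use. Like any probe of this kind, the answer can be stale by the time vncserver starts.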

          jglick Jesse Glick added a comment -

          Trying more than three times would probably not hurt but I doubt it would help. The real problem is that displays are not being reused in your case. I have no idea why; you will need to debug it.

          I just committed (but have not yet released) a fix for JENKINS-12431; probably unrelated but worth checking just in case.
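The retry question discussed above can be sketched as a loop with a configurable attempt limit. This is a hypothetical Python illustration, not the plugin's logic: `start_with_retries` and the `launch` callable are invented names, and `launch` stands in for running vncserver on a given display and reporting success.

```python
def start_with_retries(candidates, launch, max_attempts=4):
    """Try successive displays until launch(display) succeeds.

    `launch` is a hypothetical callable that returns True on success.
    Returns (display, failed), where `failed` lists the displays that
    did not start before the successful one.
    """
    failed = []
    for display in candidates:
        if len(failed) >= max_attempts:
            break
        if launch(display):
            return display, failed
        failed.append(display)
    raise RuntimeError(
        "no display started after %d attempts: %s" % (len(failed), failed)
    )
```

Raising the limit only helps if a free display exists further along the candidate list. If displays are never released, every extra attempt just uses up another number.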

          jglick Jesse Glick added a comment -

          https://github.com/jenkinsci/xvnc-plugin/pull/2 purports to fix this or something similar but the root cause of the problem is not explained or directly addressed.


          davidparsson David Pärsson added a comment - That's from a colleague of mine, and the fix seems to have resolved the problem for us. The root cause was related to external factors. From my point of view, this issue can be considered resolved.

          levsa Levon Saldamli added a comment - Resolved in xvnc-1.12

          People

            Assignee: Unassigned
            Reporter: David Pärsson (davidparsson)
            Votes: 1
            Watchers: 4
