Type: Bug
Resolution: Fixed
Priority: Major
Jenkins is running on a Linux host.
Builds are tied to a remote Mac host.
There are several matrix builds.
Sometimes jobs block each other and "deadlock". We must then manually cancel and restart the builds.
IRC Transcript
--------------
amrox: Hello. I had a brief exchange with @jenkins on twitter yesterday http://dl.dropbox.com/u/45634/Screen%20Shot%202011-09-08%20at%205.18.05%20PM.png
amrox: my jenkins fell into a "deadlock" state again
amrox: it's still in that state now. what information can I provide?
rtyler: *nudges kohsuke *
rtyler: amrox: I'm @jenkinsci FWIW
amrox: rtyler: hi, and thanks for the responses yesterday
amrox: here's a screen recording... best way I could think to show the issue http://dl.dropbox.com/u/45634/Screen%20Recording%204.mov
amrox: I think it can be solved by just adding another executor
amrox: but I'd like to avoid that, and it seems like a bug?
kohsuke: amrox: we need thread dump. see https://wiki.jenkins-ci.org/display/JENKINS/Build+is+hanging
amrox: jenkins master dump: http://dl.dropbox.com/u/45634/threaddump.html
amrox: slave: http://dl.dropbox.com/u/45634/slave-threaddump.html
amrox: is that format acceptable? helpful at all?
amrox: should I just file a bug?
kohsuke: amrox: yes, that'd be great
kohsuke: jenkins-admin: create ant-plugin on github for kohsuke
kohsuke: jenkins-admin: create javadoc-plugin on github for kohsuke
amrox: kohsuke: will do thanks
kohsuke: Is "content_viewer_ios_develop_build" the hanging job?
kohsuke: OK, so the issue is that the matrix parents are blocking the execution of their child builds
kohsuke: but the parent is also waiting for the completion of the child builds, hence the deadlock
kohsuke: amrox: ^^ did I get that right?
kohsuke: the question is why content_asset_verify_auto build is occupying an executor
amrox: yes that seems accurate
kohsuke: content_viewer_ios_develop_build is correctly using a temporary flyweight executor
kohsuke: ... as seen by the lack of number in the executor table
kohsuke: amrox: I assume all those builds are tied to remote-macslave-1
amrox: kohsuke: yes
kohsuke: OK. We'll capture this in the ticket you'll create
kohsuke: Thanks for bringing this to our attention, and sorry for the bug
amrox: thanks for building and maintaining jenkins 
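
The deadlock described in the transcript can be reproduced in miniature. The sketch below is not Jenkins code: it stands in a Python thread pool for the slave's executors, and the function names (`parent_build`, `child_build`, `run_parent_on_executor`, `run_parent_as_flyweight`) are hypothetical. It assumes a parent that holds a regular executor while waiting for its child, and uses a timeout so the stuck case is reported instead of hanging.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def child_build():
    # Stand-in for one matrix configuration (child) build.
    return "child done"


def parent_build(pool, wait_seconds):
    # The parent schedules its child on the shared pool, then waits for it.
    child = pool.submit(child_build)
    try:
        return child.result(timeout=wait_seconds)
    except TimeoutError:
        # The child never got an executor: this is the reported "deadlock".
        child.cancel()
        return "deadlock"


def run_parent_on_executor(executors=1, wait_seconds=0.5):
    # Buggy case: the parent itself occupies one of the regular executors.
    with ThreadPoolExecutor(max_workers=executors) as pool:
        return pool.submit(parent_build, pool, wait_seconds).result()


def run_parent_as_flyweight(executors=1, wait_seconds=0.5):
    # Expected case: the parent runs outside the pool (here, on the calling
    # thread), loosely analogous to a temporary flyweight executor.
    with ThreadPoolExecutor(max_workers=executors) as pool:
        return parent_build(pool, wait_seconds)


if __name__ == "__main__":
    print(run_parent_on_executor(executors=1))
    print(run_parent_on_executor(executors=2))
    print(run_parent_as_flyweight(executors=1))
```

With a single executor the parent starves its own child. A second executor hides the problem, which matches the workaround amrox mentions, and running the parent outside the pool avoids it without adding executors.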
is blocking: JENKINS-4873 Deadlock between backup plugin and matrix project build (Closed)
is related to: JENKINS-24519 Flyweight tasks only use one-off executor when they can be scheduled immediately (Resolved)