Type: Bug
Resolution: Fixed
Priority: Critical
Environment: master on Linux; 2-3 Linux slaves with 10 executors; 1 Windows slave with one executor. The problem only shows up on the Linux slaves, where we have the highest load (up to 30 parallel jobs).
Thread has died
java.lang.IllegalStateException: cannot create a build with number 8558 since that (or higher) is already in use among [8099, 8312, 8317, 8318, 8319, 8320, 8321, 8322, 8323, 8326, 8328, 8329, 8330, 8331, 8333, 8335, 8336, 8338, 8340, 8341, 8348, 8351, 8355, 8358, 8360, 8361, 8362, 8363, 8370, 8371, 8380, 8381, 8386, 8387, 8394, 8397, 8398, 8399, 8400, 8401, 8402, 8403, 8404, 8405, 8406, 8407, 8408, 8409, 8418, 8419, 8420, 8421, 8422, 8423, 8424, 8426, 8428, 8435, 8436, 8440, 8441, 8442, 8484, 8487, 8488, 8489, 8490, 8491, 8492, 8493, 8495, 8497, 8498, 8499, 8500, 8501, 8508, 8512, 8513, 8514, 8515, 8522, 8523, 8524, 8526, 8527, 8528, 8529, 8530, 8531, 8535, 8536, 8537, 8545, 8546, 8549, 8550, 8552, 8554, 8555, 8556, 8557, 8560, 8563]
at jenkins.model.lazy.AbstractLazyLoadRunMap.proposeNewNumber(AbstractLazyLoadRunMap.java:361)
at hudson.model.RunMap.put(RunMap.java:189)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:284)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:74)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1205)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
at hudson.model.Executor.run(Executor.java:213)
- is blocking: JENKINS-24380 Use build numbers as IDs (Resolved)
- is duplicated by: JENKINS-27081 Executors are dying after upgrading to 1.599 (Resolved)
- is duplicated by: JENKINS-26616 Multi-config project concurrent build race (Resolved)
- is related to: JENKINS-26582 ISE from RunMap.put using /git/notifyCommit on a matrix project (Closed)
[JENKINS-26739] ISE from AbstractLazyLoadRunMap.proposeNewNumber for concurrent matrix builds
Thanks for your feedback Daniel.
nextBuildNumber currently holds 8610, which matches what the web console shows: https://ci.owncloud.org/job/pull-request-analyser-ng-simple/
To me this feels like a runtime/concurrency issue that appears as soon as too many builds are triggered at the same time.
This job is hooked up to GitHub, and builds are kicked off as soon as a pull request is created or new commits are pushed to the branches.
As described in the environment field, this can be up to 30 jobs running in parallel. Under these circumstances the executors die.
Any specific logger category of interest to analyse this issue?
THX
Builds in question are gone. Would have been interesting to see whether they were created in quick succession (within a few seconds at most).
Any further occurrences of this specific issue?
Assigning to jglick as it's related to the new build numbering, may be a concurrency issue when many builds are started in parallel (since they now can be started at a rate of more than 1/second).
Well Job.assignBuildNumber is synchronized so new Run instances should never collide on number. But some sort of race condition seems like a likely explanation.
Similar to JENKINS-26582 but with a different stack trace, which may or may not be significant.
Are you using any plugins which might do funny things with builds—Heavy Job, Gerrit Trigger (known to be buggy unless you install the new beta), etc.?
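To illustrate the point about Job.assignBuildNumber being synchronized, here is a minimal sketch of a synchronized counter. BuildCounter and assignConcurrently are hypothetical names for illustration only and are not Jenkins code; the sketch only shows that a synchronized read-and-increment hands out distinct numbers under concurrent load, and says nothing about what happens to a build after its number has been assigned.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a job's build counter; not the Jenkins implementation.
class BuildCounter {
    private int nextBuildNumber;

    BuildCounter(int first) {
        this.nextBuildNumber = first;
    }

    // synchronized makes the read-and-increment atomic, so no two callers get the same number.
    synchronized int assignBuildNumber() {
        return nextBuildNumber++;
    }

    // Starts `threads` workers that each request `perThread` numbers,
    // and returns the set of distinct numbers that were handed out.
    static Set<Integer> assignConcurrently(BuildCounter counter, int threads, int perThread) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(counter.assignBuildNumber());
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // 30 parallel workers, mirroring the load described in this issue.
        Set<Integer> seen = assignConcurrently(new BuildCounter(8558), 30, 100);
        System.out.println("distinct numbers assigned: " + seen.size());
    }
}
```

If the number of distinct values equals the number of requests, no two callers collided on a number.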
The job in question is based on the github pull request builder - might that be the reason?
Could a quiet period help? Let me try that ...
Okay, the quiet period doesn't help. It feels like it happens a bit less often, but we are not that active at the moment.
Any further ideas on this? THX
After upgrading from 1.596 to 1.599 I got the same issue. It happened on a matrix job.
Here is a log; note how the same build numbers are attempted several times and fail each time:
Feb 18, 2015 6:55:09 AM hudson.model.Executor run
SEVERE: Unexpected executor death
java.lang.IllegalStateException: cannot create a build with number 2439 since that (or higher) is already in use among [2376, 2377, 2378, 2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2435, 2436, 2441]
at jenkins.model.lazy.AbstractLazyLoadRunMap.proposeNewNumber(AbstractLazyLoadRunMap.java:361)
at hudson.model.RunMap.put(RunMap.java:189)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:284)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:74)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1205)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
at hudson.model.Executor.run(Executor.java:213)
Feb 18, 2015 6:55:09 AM hudson.model.Executor run
SEVERE: Unexpected executor death
java.lang.IllegalStateException: cannot create a build with number 2440 since that (or higher) is already in use among [2376, 2377, 2378, 2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2435, 2436, 2441]
at jenkins.model.lazy.AbstractLazyLoadRunMap.proposeNewNumber(AbstractLazyLoadRunMap.java:361)
at hudson.model.RunMap.put(RunMap.java:189)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:284)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:74)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1205)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
at hudson.model.Executor.run(Executor.java:213)
Feb 18, 2015 6:55:14 AM hudson.model.Executor run
SEVERE: Unexpected executor death
java.lang.IllegalStateException: cannot create a build with number 2439 since that (or higher) is already in use among [2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2435, 2436, 2437, 2438, 2441]
at jenkins.model.lazy.AbstractLazyLoadRunMap.proposeNewNumber(AbstractLazyLoadRunMap.java:361)
at hudson.model.RunMap.put(RunMap.java:189)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:284)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:74)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1205)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
at hudson.model.Executor.run(Executor.java:213)
Feb 18, 2015 6:55:15 AM hudson.model.Executor run
SEVERE: Unexpected executor death
java.lang.IllegalStateException: cannot create a build with number 2440 since that (or higher) is already in use among [2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2435, 2436, 2437, 2438, 2441]
at jenkins.model.lazy.AbstractLazyLoadRunMap.proposeNewNumber(AbstractLazyLoadRunMap.java:361)
at hudson.model.RunMap.put(RunMap.java:189)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:284)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:74)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1205)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
at hudson.model.Executor.run(Executor.java:213)
Feb 18, 2015 6:55:19 AM hudson.model.Executor run
SEVERE: Unexpected executor death
java.lang.IllegalStateException: cannot create a build with number 2439 since that (or higher) is already in use among [2376, 2377, 2378, 2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2435, 2436, 2437, 2438, 2441]
at jenkins.model.lazy.AbstractLazyLoadRunMap.proposeNewNumber(AbstractLazyLoadRunMap.java:361)
at hudson.model.RunMap.put(RunMap.java:189)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:284)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:74)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1205)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
at hudson.model.Executor.run(Executor.java:213)
Feb 18, 2015 6:55:20 AM hudson.model.Executor run
SEVERE: Unexpected executor death
java.lang.IllegalStateException: cannot create a build with number 2440 since that (or higher) is already in use among [2376, 2377, 2378, 2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2435, 2436, 2437, 2438, 2441]
at jenkins.model.lazy.AbstractLazyLoadRunMap.proposeNewNumber(AbstractLazyLoadRunMap.java:361)
at hudson.model.RunMap.put(RunMap.java:189)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:284)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:74)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1205)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
at hudson.model.Executor.run(Executor.java:213)
Hi All,
To work around the issue I downgraded Jenkins to 1.595.
This proves that the problem is in Jenkins core.
The bug appears to be in Matrix Plugin (where it creates MatrixRuns for MatrixConfigurations with build numbers synchronized with their parent MatrixBuild) but the new, stricter validation in Jenkins core results in the broken behavior now having actual consequences.
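The interaction described above, where configuration runs take their number from the parent build while core rejects any number that is not higher than those already in use, can be modelled roughly as follows. This is a simplified sketch and not the Jenkins implementation: StrictRunMap is a hypothetical class, and the order in which the two child runs are registered is an assumed scenario chosen to trigger the check.

```java
import java.util.TreeMap;

// Hypothetical model of a run map that only accepts strictly increasing build numbers.
class StrictRunMap {
    private final TreeMap<Integer, String> builds = new TreeMap<>();

    // Rejects any number that is not higher than every number already stored.
    void put(int number, String description) {
        if (!builds.isEmpty() && number <= builds.lastKey()) {
            throw new IllegalStateException("cannot create a build with number " + number
                    + " since that (or higher) is already in use among " + builds.keySet());
        }
        builds.put(number, description);
    }

    public static void main(String[] args) {
        StrictRunMap configurationRuns = new StrictRunMap();
        // Assumed scenario: two parent builds run concurrently, and the run
        // belonging to the later parent happens to be registered first.
        configurationRuns.put(8560, "run for parent build #8560");
        try {
            configurationRuns.put(8558, "run for parent build #8558");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under a looser map that only rejected exact duplicates, the second put would succeed, which is one way to read the remark that the stricter validation gave the existing plugin behavior actual consequences.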
I see, the overloads of getNextBuildNumber and assignBuildNumber in MatrixConfiguration are rather dangerous.
The good news is that after updating the plugin’s baseline to 1.600, a similar error appears for example in MatrixProjectTest.testConcurrentBuild, which means I have something concrete to go on.
Code changed in jenkins
User: Jesse Glick
Path:
src/test/java/hudson/matrix/MatrixProjectTest.java
http://jenkins-ci.org/commit/matrix-project-plugin/5f011e8f2718937295584cdf1101941631d6e7ec
Log:
JENKINS-26739 Suppress testConcurrentBuild until we can pick up a fix.
Code changed in jenkins
User: Jesse Glick
Path:
core/src/main/java/hudson/model/RunMap.java
http://jenkins-ci.org/commit/jenkins/5ada903554a493fcfbc8e23c5c8d02bc50c17845
Log:
[FIXED JENKINS-26739] Suppress monotonicity assertion for MatrixRun.
Code changed in jenkins
User: Jesse Glick
Path:
changelog.html
core/src/main/java/hudson/model/RunMap.java
http://jenkins-ci.org/commit/jenkins/cf2b02dc58c140aafc4e20115a8bc936d05815be
Log:
JENKINS-26739 Merging #1592.
Compare: https://github.com/jenkinsci/jenkins/compare/87ab95e7fc30...cf2b02dc58c1
Code changed in jenkins
User: Jesse Glick
Path:
changelog.html
core/src/main/java/hudson/model/RunMap.java
http://jenkins-ci.org/commit/jenkins/6ee6fc3b2f7a9bceca1b04c4af5e6f1b024e2ab4
Log:
JENKINS-26739 Backporting #1592 to rc.
(cherry picked from commit cf2b02dc58c140aafc4e20115a8bc936d05815be)
Conflicts:
changelog.html
Integrated in jenkins_main_trunk #3995
[FIXED JENKINS-26739] Suppress monotonicity assertion for MatrixRun. (Revision 5ada903554a493fcfbc8e23c5c8d02bc50c17845)
Result = SUCCESS
jesse glick : 5ada903554a493fcfbc8e23c5c8d02bc50c17845
Files :
- core/src/main/java/hudson/model/RunMap.java
Integrated in jenkins_main_trunk #3997
JENKINS-26739 Backporting #1592 to rc. (Revision 6ee6fc3b2f7a9bceca1b04c4af5e6f1b024e2ab4)
Result = SUCCESS
jesse glick : 6ee6fc3b2f7a9bceca1b04c4af5e6f1b024e2ab4
Files :
- changelog.html
- core/src/main/java/hudson/model/RunMap.java
I have not seen this issue anymore since we upgraded to 1.600 - many thanks!
gbougeard really with the same stack trace? Seems more likely that you are seeing some other issue with a similar but distinct symptom. File it separately, with steps to reproduce if at all possible (otherwise at least a log file, support bundle from the Support Core plugin, etc.), blocking JENKINS-24380.
Do you think it's a different issue?
Mar 25, 2015 12:57:18 PM SEVERE hudson.model.Executor run
Unexpected executor death
java.lang.IllegalStateException: /var/lib/jenkins/jobs/service-mysql-migrations_master/configurations/axis-BASE_TAG/prod/builds/218 already existed; will not overwite with service-mysql-migrations_master/BASE_TAG=prod #218
at hudson.model.RunMap.put(RunMap.java:187)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:284)
at hudson.matrix.MatrixConfiguration.newBuild(MatrixConfiguration.java:74)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1205)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
gbougeard Correct, that is unrelated, and already tracked as JENKINS-26582 (currently with no known way to reproduce).
Code changed in jenkins
User: Jesse Glick
Path:
src/test/java/hudson/matrix/MatrixProjectTest.java
http://jenkins-ci.org/commit/matrix-project-plugin/66592112c2984ecd379cf6ff0ab8e1eed68dfa60
Log:
JENKINS-26739 testConcurrentBuild should pass again as of 1.602.
Reopening this issue. I'm seeing this exception and a lot of dead executors on jenkins 1.622.
java.lang.IllegalStateException: cannot create a build with number 2322 since that (or higher) is already in use among [2261, 2262, 2263, 2264, 2265, 2266, 2267, 2268, 2269, 2270, 2271, 2272, 2273, 2274, 2275, 2276, 2277, 2278, 2279, 2280, 2281, 2282, 2283, 2284, 2285, 2286, 2287, 2288, 2289, 2290, 2291, 2292, 2293, 2294, 2295, 2296, 2297, 2298, 2299, 2300, 2301, 2302, 2303, 2304, 2305, 2306, 2307, 2308, 2309, 2310, 2457, 2458, 2459, 2460, 2461, 2462, 2463, 2464, 2465, 2466, 2467, 2468, 2469, 2470, 2471, 2472, 2473, 2474, 2475, 2476, 2477, 2478, 2479, 2480, 2481, 2482, 2483, 2484, 2485, 2486, 2487, 2488, 2489, 2490, 2491, 2492, 2493, 2494, 2495, 2496, 2497, 2498, 2499, 2500, 2501, 2502, 2503, 2504, 2505, 2506]
at jenkins.model.lazy.AbstractLazyLoadRunMap.proposeNewNumber(AbstractLazyLoadRunMap.java:361)
at hudson.model.RunMap.put(RunMap.java:192)
at jenkins.model.lazy.LazyBuildMixIn.newBuild(LazyBuildMixIn.java:178)
at hudson.model.AbstractProject.newBuild(AbstractProject.java:1019)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:1218)
at hudson.model.AbstractProject.createExecutable(AbstractProject.java:144)
at hudson.model.Executor$1.call(Executor.java:335)
at hudson.model.Executor$1.call(Executor.java:317)
at hudson.model.Queue._withLock(Queue.java:1345)
at hudson.model.Queue.withLock(Queue.java:1210)
at hudson.model.Executor.run(Executor.java:317)
This issue is specifically for Matrix Projects. Your problem is with a different project type. When filing a new issue (assuming none exists yet), please try to determine why Jenkins would assign a build number much lower than the highest existing build number.
Possible. Check the file named 'nextBuildNumber' in JENKINS_HOME/jobs/(jobname). It may be as simple as changing its contents.
Anything in the logs to indicate how you got into this situation?
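The suggested check of the nextBuildNumber file could be scripted along these lines. This is a sketch under assumptions: it expects the job directory as its only argument, assumes that build records live in numerically named entries under a builds subdirectory (as the builds/218 path quoted earlier in this issue suggests), and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; compares nextBuildNumber with the highest existing build number.
class NextBuildNumberCheck {

    // Reads the integer stored in <jobDir>/nextBuildNumber.
    static int readNextBuildNumber(Path jobDir) throws IOException {
        byte[] raw = Files.readAllBytes(jobDir.resolve("nextBuildNumber"));
        return Integer.parseInt(new String(raw, StandardCharsets.UTF_8).trim());
    }

    // Returns the highest purely numeric entry name under <jobDir>/builds, or 0 if there is none.
    // Assumption: each build is stored under an entry named after its build number.
    static int highestBuildNumber(Path jobDir) throws IOException {
        Path builds = jobDir.resolve("builds");
        if (!Files.isDirectory(builds)) {
            return 0;
        }
        int highest = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(builds)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (name.matches("\\d{1,9}")) {
                    highest = Math.max(highest, Integer.parseInt(name));
                }
            }
        }
        return highest;
    }

    // The counter is usable only if it is higher than every existing build number.
    static boolean isConsistent(Path jobDir) throws IOException {
        return readNextBuildNumber(jobDir) > highestBuildNumber(jobDir);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NextBuildNumberCheck <job directory>");
            return;
        }
        Path jobDir = Paths.get(args[0]);
        System.out.println("nextBuildNumber = " + readNextBuildNumber(jobDir));
        System.out.println("highest existing build = " + highestBuildNumber(jobDir));
        System.out.println(isConsistent(jobDir) ? "consistent" : "nextBuildNumber is too low");
    }
}
```

With the numbers from the last stack trace (next number 2322, highest existing build 2506) this would report that nextBuildNumber is too low.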