When a job is configured to send status emails (for example on success, failure, or still failing), the job blocks the node for 15 minutes or more just to send the email. Given that some of our jobs run for only 2 minutes, this is a very large overhead that we need to get rid of soon. It is mainly blocking our internal QA team from signing off on new releases.
Here is an example of the last lines of a job that has debug mode enabled for email-ext:
00:09:31.082 Archiving artifacts
00:09:31.093 Recording test results
00:09:31.218 Checking for post-build
00:09:31.218 Performing post-build step
00:09:31.219 Checking if email needs to be generated
00:09:31.219 Email was triggered for: Success
00:09:31.219 Sending email for trigger: Success
00:09:31.219 NOT overriding default server settings, using Mailer to create session
00:24:25.439 messageContentType = text/plain; charset=UTF-8
00:24:25.446 Adding recipients from recipient list
00:31:41.324 Successfully created MimeMessage
00:31:41.324 Sending email to: firstname.lastname@example.org
00:31:41.523 Finished: SUCCESS
As you can see, the job itself runs for 9:31 minutes. After the Mailer session is created, about 15 minutes pass before the content type is set (I assume this is where the content gets added), and then about 7 more minutes pass between adding the recipients and creating the MimeMessage.
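To make it easier to spot where the time goes, here is a minimal Python sketch that computes the gaps between consecutive timestamped lines. It only assumes the `HH:MM:SS.mmm message` layout of the console output above; the function names (`parse_elapsed`, `gaps`) and the one-minute threshold are my own and not part of Jenkins or email-ext.

```python
from datetime import timedelta

# The relevant lines copied from the console output above.
LOG = """\
00:09:31.219 NOT overriding default server settings, using Mailer to create session
00:24:25.439 messageContentType = text/plain; charset=UTF-8
00:24:25.446 Adding recipients from recipient list
00:31:41.324 Successfully created MimeMessage
00:31:41.324 Sending email to: firstname.lastname@example.org
00:31:41.523 Finished: SUCCESS
"""


def parse_elapsed(stamp):
    """Turn an 'HH:MM:SS.mmm' elapsed-time prefix into a timedelta."""
    hours, minutes, seconds = stamp.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def gaps(log_text):
    """Return (gap, message) pairs, where gap is the time that passed
    between a line and the line that follows it."""
    entries = []
    for line in log_text.splitlines():
        stamp, _, message = line.partition(" ")
        entries.append((parse_elapsed(stamp), message))
    return [
        (later - earlier, message)
        for (earlier, message), (later, _) in zip(entries, entries[1:])
    ]


# Hypothetical threshold: only report steps that stall for over a minute.
for gap, message in gaps(LOG):
    if gap > timedelta(minutes=1):
        print(f"{gap} spent after: {message}")
```

For the excerpt above this flags two steps: just under 15 minutes after the Mailer session line, and a little over 7 minutes after the recipient list line.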
During all that time Jenkins has a dramatically high CPU load, consuming nearly all CPU power at 99% load. The more jobs run concurrently, the worse the situation becomes.
Right now we are using version 2.32. We haven't upgraded to the latest version yet because we don't see any features or fixes in it that we would benefit from.
Detailed information about our current problems can be found in our own issue tracker: https://github.com/mozilla/mozmill-ci/issues/301
We would appreciate a quick fix if possible. I will be around if you need more information. You can also reach me via IRC; my nickname is whimboo.