Bug
Resolution: Not A Defect
Major
None
Jenkins 2.289.2
Mesos Cloud 1.7.2
Hello,
After upgrading from Jenkins 2.164 to Jenkins 2.289.2, we're running into a very peculiar issue.
Pipeline jobs keep dying for no apparent reason. Most jobs - but not all of them - stop after about 10-15 minutes.
Looking at the job logs shows only:
Agent <AgentID> was deleted; cancelling node body
The sub job - as expected - shows "Calling Pipeline Cancelled" and the job ends with a FlowInterruptedException.
When the sub job is run independently of the pipeline, it works without a hitch. I've looked at all logs related to the job itself and to Mesos, but Mesos is simply doing as Jenkins instructs it: there are no errors or any indications of resource problems. (My first thought was undersized agents, but analysis showed no such thing; even properly sized agents experience this issue.)
Any Mesos-related logs in Jenkins (Logger org.jenkinsci.plugins.mesos -> Log level ALL) merely show:
<AgentID> with slave org.jenkinsci.plugins.mesos.MesosSlave[<AgentID>] is not pending deletion or the slave is null
Until at some point it changes to:
<AgentID> with slave org.jenkinsci.plugins.mesos.MesosSlave[ null ] is not pending deletion or the slave is null
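For context, the messages above were captured with a Jenkins Log Recorder, which routes the plugin's java.util.logging output at the chosen level. The sketch below is only an illustration of that mechanism (the class name and handler setup are mine, not the plugin's code): it raises the org.jenkinsci.plugins.mesos logger to ALL and attaches a handler so that FINE-level messages like the ones quoted are not dropped.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class MesosLogCapture {
    public static void main(String[] args) {
        // Roughly what a Jenkins Log Recorder for this package at level ALL does:
        // lower the logger threshold so FINE/FINER plugin messages are recorded.
        Logger mesosLogger = Logger.getLogger("org.jenkinsci.plugins.mesos");
        mesosLogger.setLevel(Level.ALL);

        // Attach a handler that also passes everything through; the default
        // console handler would otherwise filter out sub-INFO messages.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.ALL);
        mesosLogger.addHandler(handler);

        // Example of a message that would now be visible in the recorder.
        mesosLogger.fine("agent is not pending deletion or the slave is null");
    }
}
```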
The Mesos master and slave logs show only bog-standard status and termination messages, with no context for what is going wrong.
This effectively blocks any pipeline from executing on this system. I should also clarify: the issue is not deterministic. While most jobs fail after ten-odd minutes, they do not fail consistently at the same time; the failure time varies by at least a few minutes, and jobs triggered manually sometimes fail only after 30 minutes to an hour.