For Jenkins Essentials, we will very often need to determine how healthy a Jenkins instance is.
At a minimum, the data snapshotting/rollback system (JENKINS-49406) needs this to quickly decide whether or not to trigger a rollback.
Health will probably be a combination of several signals, for instance:
- Is Jenkins answering?
- Are there more warning or error logs than usual? Overall, did the volume of logs explode?
- Can a build run?
- ...
We need to think this through carefully and write the associated JEP for review/feedback.
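To make the discussion concrete, here is a minimal sketch of how the signals listed above could be combined into a rollback decision. Everything in it is an assumption for illustration only: the `HealthSignals` container, the `should_rollback` function, and the `log_spike_factor` threshold are hypothetical names and values, not part of Evergreen or of any JEP.

```python
from dataclasses import dataclass


@dataclass
class HealthSignals:
    """Hypothetical container for the health signals discussed in this ticket."""
    responding: bool          # did Jenkins answer an HTTP request?
    error_log_count: int      # warning/error log lines seen in the current window
    baseline_log_count: int   # usual count for a comparable window
    build_succeeded: bool     # did a trivial smoke-test build complete?


def should_rollback(signals: HealthSignals, log_spike_factor: float = 3.0) -> bool:
    """Return True when any signal suggests the instance is unhealthy.

    log_spike_factor is an arbitrary, illustrative threshold.
    """
    if not signals.responding:
        return True
    if not signals.build_succeeded:
        return True
    # Use a floor of 1 so an instance with an empty baseline does not
    # trigger a rollback on its very first warning.
    baseline = max(signals.baseline_log_count, 1)
    return signals.error_log_count > log_spike_factor * baseline
```

Any single failing signal triggers a rollback in this sketch. Whether the real system should be that strict, or weigh the signals against each other, is exactly the kind of question the JEP would need to settle.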
- relates to
  - JENKINS-49406 Design (JEP) the Evergreen snapshotting data safety system (Resolved)
  - JENKINS-50722 origins field is never saved (Resolved)
  - JENKINS-49805 Prototype error telemetry logging with a Java Util Logger configuration (Closed)
- links to
  - https://github.com/jenkins-infra/evergreen/pull/44
Demo showing that both checking the HTTP status code of the /login URL and configuring the metrics plugin to expose a jenkinsURL/metrics/evergreen/health-check URL are doable.
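As a rough illustration of the two checks mentioned in that demo, the following sketch polls both URLs and treats anything other than HTTP 200 as unhealthy. It is not the code from the pull request. The function names `url_is_ok` and `jenkins_is_healthy` are hypothetical, the paths are taken from the description above, and it assumes the health-check endpoint answers with a non-200 status when a check fails.

```python
import urllib.error
import urllib.request


def url_is_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # HTTP error statuses (4xx/5xx), refused connections and timeouts
        # all end up here and count as "not OK".
        return False


def jenkins_is_healthy(jenkins_url: str, timeout: float = 5.0) -> bool:
    """Check the /login page and the health-check URL named in this ticket."""
    base = jenkins_url.rstrip("/")
    return (url_is_ok(f"{base}/login", timeout)
            and url_is_ok(f"{base}/metrics/evergreen/health-check", timeout))
```

For example, `jenkins_is_healthy("http://localhost:8080")` would return False while Jenkins is still starting up or is unreachable, so a real caller would likely need a grace period or retries before concluding that a rollback is warranted.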