
Monitor Jenkins node/slave health

Submitted by: @import:stackexchange-devops
jenkins node slave health monitor

Problem

For a project I'm working on, we need to create a Jenkins node in an AWS environment. For this task, we will use the Docker image available at https://github.com/jenkinsci/docker-ssh-slave. The Jenkins node will be responsible for executing AWS commands that can only be run from inside the AWS environment (not from outside).

The Jenkins master runs in our local environment. We will add the node to the Jenkins master as a permanent node. The Jenkins master is not responsible for setting up the Jenkins node, since it has no permission to do so (this is managed in AWS).

However, we would like to monitor the Jenkins slave inside AWS to see whether it is still running successfully. Our other Docker containers expose a health check. If the health check fails, the container is treated as unhealthy and is restarted accordingly.
We would like to expose such a health port on the Jenkins slave as well. If it stops responding, AWS restarts the Docker container, and the system stays stable.

We already explored Nagios monitoring, but it appears to expose the health of the node through the master, not directly on the slave.

Is such functionality available, or how can we add it?
Could you point us in the right direction, or recommend plugins which need to be installed, ...

  • Can we open the default HTTP API of Jenkins on such a Jenkins slave (https://wiki.jenkins.io/display/JENKINS/Remote+access+API)?

  • Another approach could be to add and start NGINX in the Docker container and point the health check at NGINX. But with this approach, we are not really checking the health of the Jenkins slave.

Solution

To me, it looks like you are trying to reinvent the wheel. As I understand it, you would like to restart a container when a health check fails. Since you are using Docker, in my opinion you should use an orchestration platform such as Fargate (ECS) or Kubernetes (k8s). I know that the latter lets you define health checks in the form of liveness and readiness probes. If the liveness probe fails, Kubernetes restarts the container (auto healing).

You might argue that k8s is not an option. If that is the case, you could also have a look at AWS OpsWorks. Although I have not tried it myself yet, it is able to perform auto healing as well.
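
Whichever platform performs the restart, its health check needs something to probe inside the agent container. The following is a minimal sketch, not a tested setup: it assumes the container runs an SSH daemon on port 22 (the master connects to this image over SSH), that a Python 3 interpreter is available in the image (the stock image may not include one), and that the orchestrator probes `GET /health` on port 8080. The port numbers, the path, and the `--serve` flag are illustrative choices, not something from the original post.

```python
"""Sketch of a health endpoint for a Jenkins SSH agent container."""
import socket
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumptions, not taken from the original post:
SSH_HOST = "127.0.0.1"  # sshd runs in the same container
SSH_PORT = 22           # port the Jenkins master connects to
HEALTH_PORT = 8080      # port the orchestrator's health check probes


def ssh_port_is_open(host: str = SSH_HOST, port: int = SSH_PORT,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the SSH daemon succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


class HealthHandler(BaseHTTPRequestHandler):
    """Answer GET /health with 200 if sshd is reachable, else 503."""

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        healthy = ssh_port_is_open()
        body = b"ok\n" if healthy else b"sshd unreachable\n"
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress per-request logging so probes do not flood the logs.
        pass


if __name__ == "__main__" and "--serve" in sys.argv:
    HTTPServer(("0.0.0.0", HEALTH_PORT), HealthHandler).serve_forever()
```

Started with `--serve` next to sshd, this gives an HTTP health check a target that reflects the SSH daemon rather than an unrelated NGINX process. It only confirms that the SSH port accepts connections. It does not prove that the master is currently connected or that builds can run, so treat it as a starting point.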

Context

StackExchange DevOps Q#4336, answer score: 1
