Container health checks
Open Klant is deployed as a collection of containers. The container runtime or orchestrator (like Docker or Kubernetes) can check whether a container is running as expected and take action when it is not, like restarting the container or removing it from the pool that serves traffic.
Health checks are responsible for detecting anomalies and reporting that a container is not running as expected. They can take different forms, for example:
running a script and checking the exit code of the process
making an HTTP request to an endpoint which responds with a success or error status code
opening a TCP connection to a particular port
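As a rough illustration of these three forms, the sketch below wraps any command and derives the health status from its exit code. The helper name run_check is hypothetical and not part of Open Klant. The HTTP and TCP variants are shown as comments because they assume that curl and a netcat supporting -z are installed where the check runs.

```shell
#!/bin/sh
# run_check is a hypothetical helper, not part of Open Klant: it runs any
# command and reports health based solely on that command's exit code.
run_check() {
    if "$@" >/dev/null 2>&1; then
        echo "healthy"
    else
        echo "unhealthy"
        return 1
    fi
}

# Script form: the exit code of any command or script decides the outcome.
run_check test -d /tmp || true

# HTTP form (assumes curl is installed; --fail maps HTTP error statuses
# to a non-zero exit code):
#   run_check curl --fail --silent --max-time 3 http://localhost:8000/_healthz/livez/
# TCP form (assumes a netcat that supports -z, which only tests the connection):
#   run_check nc -z localhost 8000
```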
This section of the documentation describes the recommended health checks: those provided by Open Klant itself, and those to implement for containers of third party software typically used in an Open Klant deployment. You can incorporate these in your infrastructure code (like Helm charts).
You can find code examples of these health checks in our docker-compose.yml on GitHub.
Open Klant containers
HTTP service
The Open Klant web service listens on port 8000 inside the container and accepts HTTP traffic. Three endpoints are exposed for health checks.
http://localhost:8000/_healthz/livez/
The liveness endpoint. Checks that HTTP requests can be handled. Suitable for liveness (and readiness) probes. This is the check with the lowest overhead.
http://localhost:8000/_healthz/
Checks the connections with the database and caches, the database migration state…
Suitable for the startup probe. This is the most expensive check to run, as it checks all dependencies of the application.
http://localhost:8000/_healthz/readyz/
The readiness endpoint. Checks that requests can be handled and tests that the default cache (used for sessions) and the database connection function. Slightly more expensive than the liveness check, but a good candidate for the readiness probe.
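The mapping between probe types and endpoints described above could be sketched as a small shell helper. This is an illustration only: the names probe_url and run_probe and the BASE_URL variable are hypothetical, and the sketch assumes curl is available where the probe runs.

```shell
#!/bin/sh
# Map a probe type to the endpoint recommended for it.
# probe_url, run_probe and BASE_URL are hypothetical names for this sketch.
probe_url() {
    base="${BASE_URL:-http://localhost:8000}"
    case "$1" in
        liveness)  echo "$base/_healthz/livez/" ;;
        readiness) echo "$base/_healthz/readyz/" ;;
        startup)   echo "$base/_healthz/" ;;
        *)         echo "unknown probe type: $1" >&2; return 2 ;;
    esac
}

# Issue the request. curl --fail turns HTTP status codes >= 400 into a
# non-zero exit code, which is what the container runtime looks at.
run_probe() {
    url="$(probe_url "$1")" || return 2
    curl --fail --silent --show-error --output /dev/null --max-time 3 "$url"
}

# Usage as a probe command: run_probe readiness
```

Keeping the cheap liveness endpoint for frequent probes and reserving the full check for startup limits the load that probing itself puts on the database and caches.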
Tip
Ensure the ALLOWED_HOSTS environment variable contains localhost. See
Environment configuration reference for more details.
Tip
The maykin-common executable is available in the container and can be
used to perform the health checks, as an alternative to HTTP probes.
maykin-common health-check \
--endpoint=http://localhost:8000/_healthz/livez/ \
--timeout=3
Celery workers
The Celery Worker service is responsible for picking up and executing background tasks scheduled by the web service or Celery beat.
The worker creates and updates an event loop liveness file at
/app/tmp/celery_worker_event_loop.live, which is touched every minute. Additionally,
when the worker is ready to accept tasks, it creates the
/app/tmp/celery_worker.ready file and removes it when the worker shuts down.
The worker liveness can be checked with the maykin-common CLI:
maykin-common worker-health-check \
--broker redis://redis:6379/0 \
--liveness-file /app/tmp/celery_worker_event_loop.live \
--worker-name celery@docker
Caution
Adapt the --broker and --worker-name options to your environment.
--broker must match the value of the CELERY_BROKER setting.
--worker-name should not be necessary, as it is taken from the CELERY_WORKER_NAME envvar if set, and otherwise falls back to celery@<hostname>, where the hostname of the container is used.
If pings are failing, you may need to provide the worker name(s) explicitly.
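To illustrate what the event loop liveness file conveys, a freshness check can be sketched by hand. This is not how maykin-common implements its check, only a minimal approximation: the helper name is_fresh and the two minute threshold are assumptions, and it relies on the -mmin option of find (available in GNU, BSD and BusyBox find).

```shell
#!/bin/sh
# Succeed only if the file exists and was modified within the last N minutes.
# is_fresh is a hypothetical helper; the default of 2 minutes is an assumption
# that leaves some slack on top of the one minute touch interval.
is_fresh() {
    file="$1"
    max_age_minutes="${2:-2}"
    [ -f "$file" ] || return 1
    # find prints the path only when the modification time is recent enough.
    [ -n "$(find "$file" -mmin "-$max_age_minutes")" ]
}

# Usage as a probe command:
#   is_fresh /app/tmp/celery_worker_event_loop.live 2
```

A stale or missing file then indicates that the event loop of the worker has stopped making progress, even if the process itself is still running.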
Tip
You can also use the health checks for readiness in rolling deployments on Kubernetes, so that old pods are only stopped when the new versions are confirmed to be ready.
maykin-common worker-health-check \
--skip-ping \
--skip-event-loop-liveness \
--no-skip-readiness \
--readiness-file /app/tmp/celery_worker.ready
Celery Flower
Celery Flower is a web-app which binds to port 5555 by default. You can use the
generic HTTP health check utility from maykin-common, or set up an equivalent
HTTP probe:
maykin-common health-check \
--endpoint=http://localhost:5555/ \
--timeout=3
Third party containers
Redis
The Redis container images include the redis-cli command line utility, which
has a ping command to test connectivity to the server:
redis-cli ping
The command exits with exit code 0 on success and exit code 1 on failure.
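A slightly stricter variant could also compare the reply with the expected PONG. That comparison is an extra safeguard of this sketch rather than something the check requires. The helper name redis_is_up and the REDIS_CLI, REDIS_HOST and REDIS_PORT variables are hypothetical, with defaults that assume Redis listens on localhost:6379.

```shell
#!/bin/sh
# Succeed only if redis-cli exits with 0 and the server replies with PONG.
# redis_is_up and the REDIS_* variables are hypothetical names for this sketch.
redis_is_up() {
    reply="$("${REDIS_CLI:-redis-cli}" \
        -h "${REDIS_HOST:-localhost}" \
        -p "${REDIS_PORT:-6379}" \
        ping 2>/dev/null)" || return 1
    [ "$reply" = "PONG" ]
}

# Usage as a probe command: redis_is_up
```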
PostgreSQL
Warning
Running the database as a container can bring certain scaling and disaster recovery challenges. We only provide this check for completeness' sake.
PostgreSQL container images typically include the pg_isready binary, which tests
the database connection (accepting traffic on the specified host and port). It has a
non-zero exit code when the database is not ready.
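A minimal sketch of such a check is shown below. The host, port and timeout values are assumptions to adapt to your deployment, and the helper names describe_pg_status and check_postgres are hypothetical. The exit codes follow the pg_isready documentation: 0 for accepting connections, 1 for rejecting them, 2 for no response and 3 when no attempt was made.

```shell
#!/bin/sh
# Translate a pg_isready exit code into a readable message.
# describe_pg_status and check_postgres are hypothetical helper names.
describe_pg_status() {
    case "$1" in
        0) echo "accepting connections" ;;
        1) echo "rejecting connections" ;;
        2) echo "no response" ;;
        3) echo "no attempt made" ;;
        *) echo "unexpected exit code $1" ;;
    esac
}

# Host, port and timeout below are assumptions; adapt them to your deployment.
check_postgres() {
    status=0
    pg_isready --host="${PGHOST:-localhost}" --port="${PGPORT:-5432}" \
        --timeout=3 --quiet || status=$?
    describe_pg_status "$status"
    return "$status"
}

# Usage as a probe command: check_postgres
```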
nginx
nginx proxies HTTP traffic from the browser/client to the backend service. It also serves static assets directly. The nginx config needs to be extended with location handlers for the health checks. This ensures that the health endpoints are not accessible from outside.
Example nginx configuration snippet:
location = /_healthz/ {
    access_log off;
    add_header Content-Type text/plain;
    # block outside traffic
    allow 127.0.0.1;
    allow ::1;
    deny all;
    return 200 "ok\n";
}
location = /_healthz/livez/ {
    access_log off;
    add_header Content-Type text/plain;
    # block outside traffic
    allow 127.0.0.1;
    allow ::1;
    deny all;
    return 200 "ok\n";
}
location = /_healthz/readyz/ {
    access_log off;
    add_header Content-Type text/plain;
    # block outside traffic
    allow 127.0.0.1;
    allow ::1;
    deny all;
    return 200 "ok\n";
}
We recommend this cheap check for both the liveness and readiness checks.
You can then wire up an HTTP probe or curl script to make a GET call to
http://localhost:8080/_healthz/livez/. Note the port number - often the nginx
unprivileged image will be used, which binds to 8080 by default, but check your
specific environment to confirm.
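Such a curl script could look like the sketch below, which checks both the status code and the response body. It assumes curl is installed in the nginx image, which is not a given. The helper name nginx_is_live and the NGINX_PORT variable are hypothetical.

```shell
#!/bin/sh
# Liveness check for nginx via the /_healthz/livez/ location.
# nginx_is_live and NGINX_PORT are hypothetical names; 8080 is assumed here
# because the unprivileged nginx image binds to it by default.
nginx_is_live() {
    url="http://localhost:${NGINX_PORT:-8080}/_healthz/livez/"
    # --fail makes curl exit non-zero on HTTP status codes >= 400.
    body="$(curl --fail --silent --max-time 3 "$url")" || return 1
    # The location block returns "ok" plus a newline, which $(...) strips.
    [ "$body" = "ok" ]
}

# Usage as a probe command: nginx_is_live
```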
Smart readiness probe
You may want to consider proxying to the backend-service for the readiness check.
Warning
This can lead to cascading failures: first your backend-service becomes unavailable, which then leads to nginx becoming unavailable, and possibly other dependent services.
Tip
Even if the backend is not available, nginx may still be performing useful work by serving static files.
Example nginx configuration snippet:
location = /_healthz/readyz/ {
    access_log off;
    # block outside traffic
    allow 127.0.0.1;
    allow ::1;
    deny all;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Forwarded-Host $server_name;
    proxy_set_header X-Scheme $scheme;
    # forward to the readiness endpoint of the backend service
    proxy_pass http://web:8000/_healthz/readyz/;
}