What is the difference between liveness and readiness?

Liveness (/healthz) answers "is the process alive?" It should return 200 as long as the app is running and able to respond, and it must NOT check external dependencies. Readiness (/readyz) answers "can it serve traffic right now?" It checks that the things the app needs, like the database, are reachable, and returns 503 when they are not. The distinction matters because a supervisor should restart a process that fails liveness, but only stop sending it traffic when it fails readiness. Restarting because the database blipped just makes an outage worse.
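As a rough illustration of the split, here is a minimal sketch using only the Python standard library. It is not the output of the generator. The database address, the port, and the names liveness, readiness, db_reachable, and serve are all assumptions made for the example, and the dependency check is just a TCP connect with a short timeout.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical dependency address; point this at your own database.
DB_ADDR = ("127.0.0.1", 5432)


def db_reachable(addr=DB_ADDR, timeout=1.0):
    """Return True if a TCP connection to the dependency succeeds."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False


def liveness():
    """Process-only check: makes no external calls."""
    return 200, "ok"


def readiness(check=None):
    """Return 200 when the dependency check passes, otherwise 503."""
    check = check or db_reachable
    return (200, "ready") if check() else (503, "not ready")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            status, body = liveness()
        elif self.path == "/readyz":
            status, body = readiness()
        else:
            status, body = 404, "not found"
        payload = body.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host="127.0.0.1", port=8080):
    """Start the example server; blocks until interrupted."""
    HTTPServer((host, port), HealthHandler).serve_forever()
```

Calling serve() starts the example on port 8080. With the database down, /healthz keeps returning 200 while /readyz returns 503, which is the behavior described above.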

Why must /healthz not check the database?

Because if liveness checks the database and the database has a brief outage, every instance fails its liveness probe at once, the orchestrator kills them all, and a recoverable database blip becomes a full application restart storm. Liveness should only fail when the process itself is broken and a restart would actually help. Dependency checks belong in readiness, which removes the instance from rotation without killing it.
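The restart storm can be shown with a toy model. The sketch below is a simplification, not a real orchestrator: supervise is a hypothetical function that runs one probe round over a list of instance names and assumes every process is itself healthy.

```python
def supervise(instances, db_up, liveness_checks_db):
    """Run one probe round and return (restarted, out_of_rotation).

    Assumes every process is healthy, so the only thing that can fail
    is the database dependency.
    """
    restarted, out_of_rotation = [], []
    for name in instances:
        process_ok = True  # assumption: the process itself is fine
        live = process_ok and (db_up or not liveness_checks_db)
        ready = process_ok and db_up
        if not live:
            restarted.append(name)        # supervisor kills and restarts
        elif not ready:
            out_of_rotation.append(name)  # traffic stops, process survives
    return restarted, out_of_rotation


fleet = ["app-1", "app-2", "app-3"]

# Liveness wrongly checks the database: every instance is restarted.
storm = supervise(fleet, db_up=False, liveness_checks_db=True)

# Liveness is process-only: nothing restarts, traffic is paused instead.
split = supervise(fleet, db_up=False, liveness_checks_db=False)
```

In this model, storm is (["app-1", "app-2", "app-3"], []) and split is ([], ["app-1", "app-2", "app-3"]). The database outage is the same in both cases, but only the first one turns it into a restart of every instance.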

How do systemd and Docker use these?

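A Docker HEALTHCHECK can curl /healthz to decide whether a container is healthy. Docker itself only marks the container unhealthy, so an orchestrator or restart helper acts on that status. On systemd, the built-in watchdog (WatchdogSec=) expects keep-alive pings from the service rather than polling HTTP, so a small helper or timer unit that curls /healthz and restarts the service on failure fills the same role. A load balancer or orchestrator uses /readyz to decide whether to route traffic to an instance. The generated endpoints are designed to plug into all of them.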
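Where curl is not available in the image, the same probe can be a few lines of Python. This is a sketch under assumptions: the default URL, the port, and the names probe and main are made up for the example. It relies only on the convention that a Docker HEALTHCHECK command exits 0 for healthy and 1 for unhealthy.

```python
import http.client
import sys
import urllib.error
import urllib.request

# Hypothetical default; match it to the port your app listens on.
DEFAULT_URL = "http://127.0.0.1:8080/healthz"


def probe(url, timeout=2.0):
    """Return 0 if the URL answers with HTTP 200, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if resp.status == 200 else 1
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Covers refused connections, timeouts, non-2xx responses,
        # and malformed URLs.
        return 1


def main(argv):
    """Probe the URL given as the first argument, or the default."""
    url = argv[1] if len(argv) > 1 else DEFAULT_URL
    return probe(url)


# As a script entry point, finish with: sys.exit(main(sys.argv))
```

Saved as a script with that final line enabled, it could be wired up with a Dockerfile line such as HEALTHCHECK CMD python /app/probe.py (the path is hypothetical), or called from a systemd timer that restarts the service on a non-zero exit.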

Free tool · Runs in your browser

Healthcheck endpoints, done right.

Most healthchecks are subtly wrong — they check the database in liveness and turn a brief blip into a restart storm. Pick your framework and get correct /healthz (is the process alive) and /readyz (are dependencies reachable) endpoints.

The subtle bug in most healthchecks

It looks harmless: a single /health endpoint that pings the database and returns 200. The problem shows up the day the database has a two-second hiccup — every instance fails its check at the same moment, the orchestrator concludes they're all dead and restarts them, and a blip you'd never have noticed becomes a full restart storm. The fix is the liveness/readiness split: liveness fails only when a restart would actually help, readiness pulls an instance out of rotation without killing it.

Getting the endpoints right is half of it. The other half is what acts on them — the supervisor that restarts on a failed liveness check, the thing that notices a wedged-but-not-crashed process, and the record of when and why it happened. That layer, watching your services across the hosts you own, is what a control plane provides.

Endpoints report. Something has to act.

Infraveil watches your healthchecks across every host you own, restarts what's actually dead, routes around what's not ready, and keeps a tamper-evident record of every recovery — the layer that turns a healthcheck into uptime.