CrashLoopBackOff — what it means and how to fix it
Quick answer: CrashLoopBackOff means your container starts, exits, and Kubernetes keeps restarting it with a backoff delay that grows after each attempt, up to a five-minute cap. The crash is your app's, not Kubernetes'. Read the previous instance's logs with kubectl logs <pod> --previous, check the exit code and events in kubectl describe pod <pod>, then fix the root cause: usually bad config, a failing probe, a missing dependency, or an out-of-memory kill.
What you'll see
The pod never stays Running for long, and the restart count climbs:
$ kubectl get pods
NAME                  READY   STATUS             RESTARTS   AGE
api-7d9f8c6b4-x2k9p   0/1     CrashLoopBackOff   6          4m
"BackOff" just means Kubernetes is waiting longer between each restart (10s, 20s, 40s… up to 5 min). It's a symptom — the real failure is whatever makes the container exit. Your job is to read why it exited.
Why a container crash-loops
The app exits non-zero on startup
A missing env var, an unreachable database, or an unhandled exception kills the process the moment it boots.
A liveness probe is failing
If the app is slow to start, the liveness probe trips and Kubernetes kills a perfectly healthy container before it's ready.
OOMKilled — out of memory
The container exceeded its memory limit and was killed. You'll see Reason: OOMKilled and exit code 137 in describe.
Missing config or secret
A referenced ConfigMap/Secret key doesn't exist, or a mounted file the app expects isn't there, so it aborts immediately.
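To tell these causes apart quickly, one option is to read the previous container's termination reason and exit code straight from the pod status instead of scanning the full describe output. The following is a minimal sketch, assuming a single-container pod (hence index 0); last_termination is a hypothetical helper name for this example, not a kubectl subcommand.

```shell
# Print "<reason> <exitCode>" for the previous run of the pod's first container.
# Assumption: the pod has one container; change the [0] index otherwise.
last_termination() {
  pod=$1
  kubectl get pod "$pod" -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}{" "}{.status.containerStatuses[0].lastState.terminated.exitCode}{"\n"}'
}
```

Called as last_termination <pod>, this should print OOMKilled 137 for the memory-limit case described above. If the container has not been restarted yet, the lastState fields are empty and the output is blank.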
Diagnose it in three steps
Read the logs of the crashed instance
The current container may be too new to have logs — use --previous to see the one that just died:
kubectl logs <pod> --previous
kubectl logs <pod> --previous --tail=50
Check the exit code and events
kubectl describe pod <pod>
# Look at: Last State, Reason, Exit Code, and the Events list.
# 137 = OOMKilled or SIGKILL. 1/2 = app error. 127 = command not found.
Reproduce without the loop
Override the entrypoint so the container stays up and you can poke around:
kubectl run debug --rm -it --image=YOUR_IMAGE --command -- sh
# then run your start command by hand and watch it fail
Separate "is it alive" from "is it ready"
A huge share of crash loops are self-inflicted: a liveness probe with too short a delay kills an app that's simply slow to boot. Use a readiness probe to gate traffic, and give liveness a generous startup window (or a startupProbe):
startupProbe:
  httpGet: { path: /healthz, port: 3000 }
  failureThreshold: 30
  periodSeconds: 5 # up to 150s to boot before liveness kicks in
livenessProbe:
  httpGet: { path: /healthz, port: 3000 }
  periodSeconds: 10
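The 150s figure in the comment is simply failureThreshold multiplied by periodSeconds. As a small sketch for sizing your own values (startup_budget is a hypothetical helper, and it ignores any initialDelaySeconds you might add on top):

```shell
# Worst-case seconds a startupProbe allows before the container is restarted.
startup_budget() {
  failure_threshold=$1
  period_seconds=$2
  echo $((failure_threshold * period_seconds))
}

startup_budget 30 5   # values from the manifest above; prints 150
```

If your app can take longer than that to boot on a cold node, raise failureThreshold rather than loosening the liveness probe itself.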
If it's OOMKilled, raise the memory limit or fix the leak. If it's a config error, the fix is in your logs from step 1 — crash loops are loud once you read the right container.
Supervise services without the Kubernetes restart-loop tax
If you're fighting CrashLoopBackOff, it's worth asking whether you need Kubernetes' complexity at all. Infraveil is a backend operations control plane that runs on your own servers: it starts your services, health-checks them, and restarts what genuinely fails — without probe tuning, backoff math, or YAML archaeology. A failed deploy is caught at the gate and rolled back, not looped forever.
Frequently asked questions
What does CrashLoopBackOff mean?
Your container starts, exits, and Kubernetes restarts it repeatedly with a growing delay between attempts. The state describes the restart pattern; the actual fault is whatever makes the container exit.
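As a rough illustration of that growing delay, the sketch below prints the wait before each restart, assuming the schedule quoted earlier in this guide (10s, doubling each time, capped at 5 minutes). The function name backoff_schedule is made up for this example, and it does not model the reset that happens once a container has run cleanly for a while.

```shell
# Print the delay before each of the first N restarts.
# Assumed schedule: start at 10s, double after every restart, cap at 300s.
backoff_schedule() {
  restarts=$1
  delay=10
  cap=300
  out=""
  i=1
  while [ "$i" -le "$restarts" ]; do
    # Append the current delay, space-separated.
    out="$out${out:+ }${delay}s"
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

backoff_schedule 7   # prints: 10s 20s 40s 80s 160s 300s 300s
```

This is why a pod with six or seven restarts appears to sit idle for minutes at a time: it has already reached the cap.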
How do I see why the pod is crashing?
Run kubectl logs <pod> --previous to read the crashed container's output, and kubectl describe pod <pod> to see the exit code and events.
What does exit code 137 mean?
137 means the process was killed by SIGKILL (128 + 9). Most often it's OOMKilled — the container hit its memory limit. Raise the limit or fix the memory leak.
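The same 128 + signal arithmetic applies to other codes above 128. A small sketch for decoding whatever exit code describe reports (describe_exit_code is a hypothetical helper, and the labels for 126 and 127 follow the usual shell conventions rather than anything Kubernetes-specific):

```shell
# Translate a container exit code into a rough category.
describe_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  elif [ "$code" -eq 126 ]; then
    echo "command found but not executable"
  elif [ "$code" -eq 127 ]; then
    echo "command not found"
  elif [ "$code" -gt 128 ]; then
    # Codes above 128 mean "killed by signal (code - 128)".
    echo "killed by signal $((code - 128))"
  else
    echo "application error"
  fi
}

describe_exit_code 137   # prints: killed by signal 9
describe_exit_code 143   # prints: killed by signal 15
```

Signal 9 is SIGKILL and signal 15 is SIGTERM, so a 143 usually means the process was asked to stop rather than out of memory.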
Can a health check cause CrashLoopBackOff?
Yes. A liveness probe that starts too early or has too short a timeout will kill an app that's merely slow to boot. Use a startupProbe or a longer initial delay, and gate traffic with a readiness probe instead.