Backend Guide

How to deploy without dropping requests

The short version: Every deploy stops your old process — and if it stops abruptly, every request it was serving dies. The fix is graceful draining: before the old instance shuts down, flip it to "not ready" so the load balancer stops sending new traffic, let in-flight requests finish, close connections, then exit. Three moving parts make this work: a readiness flip, SIGTERM handling, and a drain timeout.

Why deploys drop requests

When you deploy, the orchestrator stops the old version. If it sends a kill signal and the process dies instantly, anyone mid-request gets a reset connection or a 5xx. The goal is to make "stop" mean "finish what you're doing first, then exit" — a graceful shutdown.

The three moving parts

1. Flip to "not ready"

Start failing the readiness check first, while still serving requests normally, so the load balancer stops routing new traffic to the instance before it stops accepting connections.
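A minimal sketch of the readiness flip in Node.js, assuming the load balancer polls an HTTP readiness endpoint. The createReadiness helper and its method names are hypothetical, introduced here for illustration only; the idea is just a flag that the probe handler reads and the shutdown path clears.

```javascript
// Hypothetical helper: holds the readiness flag and serves the probe.
function createReadiness() {
  let ready = true;
  return {
    isReady: () => ready,
    // Call this at the very start of shutdown, before closing the server.
    markNotReady: () => { ready = false; },
    // Probe handler: 200 while ready, 503 once the instance is draining.
    handler(req, res) {
      res.writeHead(ready ? 200 : 503, { 'Content-Type': 'text/plain' });
      res.end(ready ? 'ready' : 'draining');
    },
  };
}
```

Mounted on whatever path your load balancer probes, the handler keeps answering while the instance drains; only the status code changes, which is what moves traffic away.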

2. Handle SIGTERM

Stop accepting new requests, let in-flight ones finish, close DB and queue connections, then exit 0.
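One way the SIGTERM step might look in Node.js. gracefulShutdown and installSigtermHandler are illustrative names, closeResources stands in for whatever closes your DB and queue clients, and the proc parameter exists only so the handler can be exercised without signalling a real process.

```javascript
// Stop accepting new connections, wait for in-flight requests, then release resources.
// `closeResources` is a hypothetical async hook (DB pools, queue consumers, ...).
function gracefulShutdown(server, closeResources) {
  return new Promise((resolve, reject) => {
    // server.close() stops new connections; its callback runs once existing ones have ended.
    server.close((err) => (err ? reject(err) : resolve()));
  }).then(() => closeResources());
}

// Register the handler once; `proc` defaults to the real process object.
function installSigtermHandler(server, closeResources, proc = process) {
  proc.once('SIGTERM', () => {
    gracefulShutdown(server, closeResources)
      .then(() => proc.exit(0))
      .catch((err) => {
        console.error('shutdown failed:', err);
        proc.exit(1);
      });
  });
}
```

Closing the server before the resources matters: in-flight requests may still need the database while they finish.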

3. Set a drain timeout

If draining hangs, force-exit yourself before the platform sends SIGKILL, so you control the outcome.
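A sketch of that safety net, assuming Node.js timers. armDrainTimeout is a hypothetical helper, and exit code 1 is a choice made here to mark the drain as unclean, not something any platform requires.

```javascript
// Arm a hard deadline for the drain; returns the timer so a clean drain can cancel it.
function armDrainTimeout(ms, proc = process) {
  const timer = setTimeout(() => {
    console.error(`drain exceeded ${ms} ms, forcing exit`);
    proc.exit(1);
  }, ms);
  // unref() stops this timer from keeping the process alive by itself,
  // so a drain that finishes early can exit without waiting for the deadline.
  timer.unref();
  return timer;
}
```

Arming it as the first action in the SIGTERM handler means the deadline covers every later step, including resource cleanup.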

The deploy sequence that drops nothing

Put together, a zero-drop rollout looks like this:

1. Start the NEW instance, wait until it passes readiness.
2. Add it to the load balancer.
3. Flip the OLD instance's readiness to false (stop new traffic).
4. Send SIGTERM to the old instance.
5. Old instance: finish in-flight requests, close connections, exit.
6. (Safety) force-exit if draining exceeds the grace period.

At no point does an instance receive traffic it can't serve, and as long as draining finishes within the grace period, no request is cut off mid-flight.
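The sequence above can be sketched as a small orchestration function. The platform object and every method on it (start, waitUntilReady, addToLoadBalancer, setReady, signal, waitForExit) are hypothetical stand-ins for whatever your orchestrator exposes; this is not a description of how any particular tool implements the rollout. Step 6 is modelled here as the orchestrator sending SIGKILL once the grace period runs out.

```javascript
// `platform` is a hypothetical adapter; none of these method names come from a real API.
async function rollout(platform, oldId, newSpec, graceMs) {
  const newId = await platform.start(newSpec);      // 1. start the new instance
  await platform.waitUntilReady(newId);             //    and wait for readiness
  await platform.addToLoadBalancer(newId);          // 2. add it to the load balancer
  await platform.setReady(oldId, false);            // 3. stop new traffic to the old one
  await platform.signal(oldId, 'SIGTERM');          // 4. ask the old instance to shut down
  const exited = await platform.waitForExit(oldId, graceMs); // 5. let it drain
  if (!exited) {
    await platform.signal(oldId, 'SIGKILL');        // 6. safety net after the grace period
  }
  return newId;
}
```

The ordering is the point: the new instance is serving before the old one is told to stop, so capacity never drops to zero during the handover.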

The keep-alive gotcha

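One subtle trap: HTTP keep-alive connections stay open between requests, and a server-side close that waits for every connection to end (Node.js's server.close() before version 19, for example) waits for the idle ones too. Set a short keepAliveTimeout, or actively close idle connections during shutdown; otherwise your "graceful" drain stalls on idle sockets on every single deploy, until they time out or the drain timeout fires.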
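For Node.js specifically, both options might look like the sketch below. keepAliveTimeout and closeIdleConnections() are real http.Server members (the latter only on Node.js 18.2 and later, hence the feature check); the helper names and the 1000 ms value are illustrative assumptions.

```javascript
// Shorten how long idle keep-alive sockets linger (Node's default is 5000 ms).
// The 1000 ms default here is an arbitrary illustrative value.
function configureKeepAlive(server, idleMs = 1000) {
  server.keepAliveTimeout = idleMs;
}

// Stop accepting new connections, then drop sockets that are idle between requests.
// Connections with a request still in flight are left alone so they can finish.
function closeForDrain(server, onClosed) {
  server.close(onClosed);
  if (typeof server.closeIdleConnections === 'function') {
    server.closeIdleConnections();
  }
}
```

On runtimes without closeIdleConnections(), the short keepAliveTimeout is what bounds how long idle sockets can hold the drain open.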

How Infraveil handles this

Draining handled by the agent

Infraveil's agent owns the deploy sequence on your own servers: it brings up the new instance, waits for health, drains the old one, and only then stops it — so requests aren't severed mid-flight. The whole rollout is approval-gated and recorded, with one-click rollback. You get zero-drop deploys without hand-writing the readiness-flip-and-drain dance for every service.

Old instances drain before they're stopped — requests finish cleanly
Health-gated cutover so traffic never hits an unready instance
Every deploy approval-gated and recorded, with one-click rollback

Frequently asked questions

Why does my app drop requests on deploy?

Because the old process was stopped while still serving traffic. Either it didn't handle SIGTERM and exited immediately, or it was still draining in-flight requests when the platform's forced kill cut them off.

What is connection draining?

Letting an instance finish the requests it's already handling before it shuts down, while no longer accepting new ones. It's the core of a graceful, zero-drop deploy.

What's a readiness flip?

Marking an instance "not ready" so the load balancer stops routing new requests to it — done first, before shutdown, so traffic moves away before the process stops.

How long should the drain timeout be?

Slightly longer than your slowest normal request, but shorter than the platform's kill grace period (often 30 seconds). When the drain timeout elapses, force-exit so you control shutdown rather than getting SIGKILLed mid-request.
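As a worked example of that rule, the hypothetical helper below picks the smallest timeout that clears the slowest request by a margin while staying under the kill grace period by the same margin. The 2000 ms margin is an assumption for illustration, not a platform recommendation.

```javascript
// Pick a drain timeout between (slowest request + margin) and (grace period - margin).
// All values are in milliseconds; marginMs = 2000 is an illustrative default.
function pickDrainTimeoutMs(slowestRequestMs, killGraceMs, marginMs = 2000) {
  const lower = slowestRequestMs + marginMs;
  const upper = killGraceMs - marginMs;
  if (lower > upper) {
    throw new RangeError('no drain timeout fits: raise the grace period or shorten slow requests');
  }
  return lower;
}
```

With an 8-second slowest request and a 30-second grace period, pickDrainTimeoutMs(8000, 30000) yields 10000. If no value fits, the helper throws rather than guessing, which is a signal to raise the grace period.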