Zero-downtime database migrations
The short version: a migration is the single riskiest change you make to production - it can lock a hot table, or break the running app if the schema and the code disagree. The safe path is expand-contract: make additive, backwards-compatible changes first, deploy code that works with both the old and new schema, then remove the old - in separate steps, never all at once. Avoid long locks, and always have a rollback before you run it.
Why migrations are dangerous
Locks on a hot table
A migration that takes a table lock blocks every read or write to it - that is downtime, even if the app is up.
Schema and code disagree
Drop or rename a column the running app still uses and you break production the instant the migration lands.
Long-running rewrites
A full-table rewrite or a giant UPDATE can run for minutes, holding locks and starving the database.
No way back
A destructive change with no rollback path turns a small mistake into an outage you cannot undo.
The expand-contract pattern
Expand - add, backwards-compatible
# add the new structure the OLD code can safely ignore
ALTER TABLE users ADD COLUMN email_verified boolean; -- nullable, no rewrite
CREATE INDEX CONCURRENTLY idx_users_email ON users(email); -- no table lockMigrate - deploy dual-aware code, backfill in batches
# app now reads/writes both old and new; backfill without one giant lock
UPDATE users SET email_verified = false
WHERE email_verified IS NULL AND id BETWEEN $1 AND $2; -- in batchesContract - remove the old, last and separately
# only once nothing reads the old column/table, in a later deploy:
ALTER TABLE users DROP COLUMN legacy_flag;Each phase is its own deploy and is backwards-compatible, so there is never a moment where the live schema breaks the running code - and you can stop or roll back at any step.
Avoid locks, take a backup, keep a rollback
Use the online path for anything on a busy table - CREATE INDEX CONCURRENTLY in Postgres, online DDL or gh-ost / pt-online-schema-change in MySQL - and backfill in batches rather than one big UPDATE. Take a backup before any destructive change, write and test the down migration first, and never combine an additive and a destructive change in the same step.
Size the backup window and recovery target with the RPO/RTO calculator, and make sure your connection pool has headroom - a long migration plus a deploy can briefly double connection demand.
A migration is a production change - gate it
A schema migration is the single highest-risk thing you do to a live database, which is exactly the kind of change Infraveil exists to govern. On your own servers, a migration runs as an approved, scoped, recorded change - gated before it touches production, tied to the deploy that needs it, and written to an audit trail - so “who changed the schema, when, and can we undo it” has an answer, and a risky migration cannot just slip into prod unreviewed.
Frequently asked questions
What is the expand-contract pattern?
A way to change a schema with no moment where code and database disagree. Expand: add new structure the old code ignores. Migrate: deploy code that uses both and backfill. Contract: remove the old once nothing uses it. Each step is backwards-compatible and independently deployable.
How do I add a column without downtime?
Add it nullable (modern Postgres adds a constant default instantly; older versions forced a full rewrite). Deploy code that tolerates null, backfill in batches not one big UPDATE, then add NOT NULL if needed. Never add a slow-default NOT NULL column to a large busy table in one step.
How do I add an index without locking?
Postgres: CREATE INDEX CONCURRENTLY - no blocking lock, slower, cannot run in a transaction. MySQL: online DDL or gh-ost / pt-online-schema-change for large tables. Always prefer the online path on a table taking traffic.
How do I roll back a migration?
Write and test the down path before the up, and keep migrations small and additive so they are reversible. Expand-contract helps: each step is backwards-compatible, so you can roll back code without the schema breaking. Take a backup before any destructive change and keep old structure until you are sure.
Client-side, no signup — they run in your browser.