May 10, 2026

Before you ship — a pre-deploy checklist for AI-built apps

AI builders get about 90% of decisions right and 10% catastrophically wrong. The wrong ones cluster in predictable places. Here's the 24-point checklist to run through before you deploy, with a copy-paste version included.

Your app works on localhost. Production is a different thing.

The fastest way to find out you have a bug in production is to deploy. The cheapest way is this checklist. Going through it takes about an hour. Skipping it costs days.

This isn't paranoia. AI builders get about 90% of decisions right and 10% catastrophically wrong, and the wrong ones cluster in the same places every time. That's exactly what the checklist targets.

The eight categories

Each category is 2–4 concrete items. None take more than a few minutes to verify.

1. Secrets & auth (4 items)

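  • No API keys in code. Run grep -rE "sk-|sk_live|Bearer " . --exclude-dir=node_modules --exclude-dir=.next --exclude-dir=.git from the repo root. (Stripe's pk_live keys are publishable and meant to be public; the sk_live ones are the secrets.) Anything that matches outside .env is a leak, and .env itself should be in .gitignore. Fix it before doing anything else.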
  • Every endpoint that returns user data is authenticated. List your routes. For each one, mark whether auth is enforced server-side, not just on the client. AI generates client-side guards that look like security and aren't.
  • Sessions invalidate server-side, not just by deleting a cookie. A logged-out session token should fail validation on the server. Test it with curl and the old token.
  • Password reset has rate limits and real email confirmation. Anything else is a free password-reset spammer for whoever finds your domain.
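
To make the session item concrete, here is a minimal sketch of what "invalidates server-side" means. It assumes an in-memory dictionary standing in for your real session table; the SessionStore class and its method names are hypothetical, not taken from any framework.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class SessionStore:
    """In-memory stand-in for a server-side session table (hypothetical)."""

    def __init__(self, ttl_seconds: int = 3600) -> None:
        # token -> (user_id, expiry timestamp)
        self._sessions: Dict[str, Tuple[str, float]] = {}
        self._ttl = ttl_seconds

    def login(self, user_id: str) -> str:
        """Create a session and return an unguessable token for the cookie."""
        token = secrets.token_urlsafe(32)
        self._sessions[token] = (user_id, time.time() + self._ttl)
        return token

    def validate(self, token: str) -> Optional[str]:
        """Return the user id for a live session, or None if unknown or expired."""
        record = self._sessions.get(token)
        if record is None:
            return None
        user_id, expires_at = record
        if time.time() >= expires_at:
            del self._sessions[token]
            return None
        return user_id

    def logout(self, token: str) -> None:
        """Delete the server-side record, so the old token stops working."""
        self._sessions.pop(token, None)
```

The point of the design: logout removes the record on the server, so replaying the old token (the curl test above) gets rejected even if the cookie was copied somewhere first.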

2. Database (4 items)

  • Backups are running and recent. Check the timestamp on the latest one. "We're on managed Postgres" is not the same as "backups are happening." Verify.
  • Migrations are reversible. Or you have a rollback script saved somewhere. The migration that runs at 2 a.m. and fails halfway is the one that needs a down script.
  • Row-level security is on if you're on Supabase or any shared-schema setup. AI often disables it during development "to make things work" and forgets to turn it back on.
  • Database connection uses TLS. sslmode=require is the minimum; verify-full is better because it also checks the server certificate. Without TLS, your credentials and query results cross the network in plaintext.
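
The TLS item is easy to check mechanically. A small sketch, assuming your connection string is a Postgres URL with sslmode passed as a query parameter; the uses_tls function and the example URLs are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Modes that refuse to fall back to an unencrypted connection.
SAFE_SSLMODES = {"require", "verify-ca", "verify-full"}


def uses_tls(database_url: str) -> bool:
    """Return True if the URL explicitly asks for an encrypted connection."""
    query = parse_qs(urlsplit(database_url).query)
    modes = query.get("sslmode", [])
    # A missing sslmode is treated as unsafe: the client default may allow plaintext.
    return bool(modes) and modes[-1] in SAFE_SSLMODES
```

For example, uses_tls("postgresql://app:secret@db.example.com:5432/app?sslmode=require") passes, while the same URL without the sslmode parameter fails. You could run a check like this at startup and refuse to boot in production if it returns False.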

3. Cost & rate limits (3 items)

  • LLM API has a hard monthly cap or billing alert at $50, not $5000. Anthropic and OpenAI both support this. Set it before you ship.
  • Per-user rate limit on any endpoint that calls an LLM. Otherwise one bad actor (or one infinite loop in your own code) drains your budget overnight.
  • Per-IP rate limit on auth, signup, password reset, and any expensive endpoint. Composite keying (IP + body fingerprint) catches more than IP alone.
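
Here is one way the rate-limit items could look in code. It is a sketch under assumptions: an in-memory sliding window (a real deployment would usually keep counters in something shared, like Redis), and one reading of "composite keying" where a request has to pass both a per-IP limit and a per-body-fingerprint limit. The class and function names are hypothetical.

```python
import hashlib
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within the last `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def fingerprint(body: bytes) -> str:
    """Short, stable hash of the request body."""
    return hashlib.sha256(body).hexdigest()[:16]


class RequestGate:
    """Require a request to pass both the per-IP and the per-body limit."""

    def __init__(self, per_ip: SlidingWindowLimiter,
                 per_body: SlidingWindowLimiter) -> None:
        self.per_ip = per_ip
        self.per_body = per_body

    def allow(self, ip: str, body: bytes) -> bool:
        ip_ok = self.per_ip.allow("ip:" + ip)
        body_ok = self.per_body.allow("body:" + fingerprint(body))
        return ip_ok and body_ok
```

The body fingerprint is what catches the same payload (say, a password reset for one email address) arriving from many different IPs, which a per-IP limit alone would wave through.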

4. Tests (2 items)

  • You ran tests locally, not just in CI. CI environments are clean and idealized. Local environments expose timing, env, and dependency issues that CI hides.
  • You manually clicked through the golden path in a real browser. Type-checks and unit tests are not a substitute for using the feature.

5. Observability (3 items)

  • Errors are logged somewhere you'll actually read. Sentry, Logtail, Better Stack — anything that isn't console.error going to a log file nobody opens.
  • You have at least one alert. Uptime is the minimum. Free tiers exist (UptimeRobot, Better Stack, HetrixTools). The first time your site goes down at 4 a.m., you will be glad you set it up.
  • You can see your traffic. Plausible, Umami, PostHog. Knowing nobody visited the page you spent two days on is information. So is the spike from a Hacker News post you didn't see coming.

6. Performance & limits (3 items)

  • No unbounded queries. Pagination on every list endpoint, with a default and a max. The query that returns 50 rows in dev and 500,000 in prod is a real story.
  • File uploads have server-side size limits. Client-side limits are decoration. The first abusive user discovers this in 30 seconds.
  • Long-running operations don't block the response. LLM calls, exports, image processing: push them to a queue or stream the response. Many hosting platforms and proxies cut a request off at 60 seconds or less, and the user sees a 504.
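
For the pagination item, "a default and a max" can be as small as one helper that every list endpoint calls. A minimal sketch; the numbers 50 and 200 and the page_bounds name are assumptions for illustration, not recommendations from any framework.

```python
from typing import Optional, Tuple

DEFAULT_PAGE_SIZE = 50   # used when the client sends nothing usable
MAX_PAGE_SIZE = 200      # hard ceiling, no matter what the client asks for


def _to_int(raw: Optional[str], fallback: int) -> int:
    """Parse a query-string value, falling back on missing or junk input."""
    if raw is None:
        return fallback
    try:
        return int(raw)
    except ValueError:
        return fallback


def page_bounds(raw_limit: Optional[str], raw_offset: Optional[str]) -> Tuple[int, int]:
    """Return a (limit, offset) pair that is safe to pass to a LIMIT/OFFSET query."""
    limit = _to_int(raw_limit, DEFAULT_PAGE_SIZE)
    offset = _to_int(raw_offset, 0)
    limit = max(1, min(limit, MAX_PAGE_SIZE))  # clamp into [1, MAX_PAGE_SIZE]
    offset = max(0, offset)                    # negative offsets make no sense
    return limit, offset
```

With this in place, a request for 500,000 rows quietly becomes a request for 200, and a request with no limit at all gets 50.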

7. Security headers (3 items)

  • CSP is enforced (not report-only) in production. AI's first instinct when something doesn't load is to widen the policy. Audit before deploy.
  • HSTS enabled with a max-age of at least 6 months. Strict-Transport-Security: max-age=15552000; includeSubDomains (15552000 seconds is 180 days). Set once and forget.
  • Other defaults set. X-Content-Type-Options: nosniff. Referrer-Policy: strict-origin-when-cross-origin. X-Frame-Options: DENY unless you're embedding the site somewhere on purpose.
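
The header items are simple enough to audit with a script. Below is a sketch of such a check, assuming you already have the response headers from your production URL as a plain dictionary; the audit_headers function is hypothetical, and it only checks that X-Frame-Options and Referrer-Policy are present, since the right values depend on whether you embed the site on purpose.

```python
import re
from typing import Dict, List

MIN_HSTS_SECONDS = 15552000  # 180 days, the value from the checklist


def audit_headers(headers: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    # Header names are case-insensitive, so normalize before looking anything up.
    found = {name.lower(): value.strip() for name, value in headers.items()}
    problems = []

    # Only the enforcing header counts; the Report-Only variant blocks nothing.
    if "content-security-policy" not in found:
        problems.append("Content-Security-Policy is missing or report-only")

    hsts = found.get("strict-transport-security", "")
    match = re.search(r"max-age=(\d+)", hsts, re.IGNORECASE)
    if match is None or int(match.group(1)) < MIN_HSTS_SECONDS:
        problems.append("Strict-Transport-Security max-age is under 6 months")

    if found.get("x-content-type-options", "").lower() != "nosniff":
        problems.append("X-Content-Type-Options is not nosniff")

    for name in ("referrer-policy", "x-frame-options"):
        if name not in found:
            problems.append(name + " is missing")

    return problems
```

Run something like this against the deployed site rather than localhost, because the headers are often added by the hosting platform or a proxy and not by the app itself.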

8. Rollback plan (2 items)

  • You can revert this deploy in under 5 minutes. Vercel, Netlify, Render, Fly — yes, by design. Custom infra — make sure you have a one-command rollback script and you've used it once.
  • You know who's on-call if something breaks. Even if it's just you with a phone alert. The 4 a.m. outage you don't know about for six hours is the one that costs you customers.

The compact version

Save this in your repo as DEPLOY-CHECKLIST.md. Run through it before every deploy. Takes 15 minutes once you've done it twice.

SECRETS & AUTH
[ ] No secrets in code (grep clean)
[ ] All user-data endpoints authenticated server-side
[ ] Sessions invalidate server-side
[ ] Password reset has rate limits + email confirmation

DATABASE
[ ] Backups recent and verified
[ ] Migrations reversible
[ ] Row-level security on (if applicable)
[ ] DB connection uses TLS

COST & RATE LIMITS
[ ] LLM billing alert set ($50, not $5000)
[ ] Per-user rate limit on LLM endpoints
[ ] Per-IP rate limit on auth + expensive endpoints

TESTS
[ ] Tests run locally, not just CI
[ ] Manual click-through done

OBSERVABILITY
[ ] Errors logged to a monitored service
[ ] Uptime alert active
[ ] Traffic analytics in place

PERFORMANCE
[ ] No unbounded queries (pagination everywhere)
[ ] File upload size limits enforced server-side
[ ] Long operations queued or streamed

HEADERS
[ ] CSP enforced (not report-only)
[ ] HSTS enabled, 6+ months
[ ] X-Content-Type-Options, Referrer-Policy, X-Frame-Options set

ROLLBACK
[ ] Revert tested, under 5 minutes
[ ] On-call / alert path defined

Why this exists at all

Every item on this list comes from somebody's bad day. The grep for API keys is on the list because someone shipped Stripe keys to a public repo. The unbounded query item is on the list because a "small dashboard" timed out on the first user with real data. The rollback item is on the list because someone's first prod deploy bricked the app and they didn't have a way back.

AI doesn't intuit any of this. It writes code that works for the inputs it imagined. Production has the inputs nobody imagined.

Run the checklist. Deploy with one less reason to be awake at 4 a.m.
