Developer pack
Claude Skill
Production Monitoring Setup
Stands up error tracking, uptime, and log access so you find out something broke before your users tell you.
What it does
Wires the minimum monitoring a small app needs to be operable: error tracking (Sentry or equivalent), an uptime check on the critical path, access to structured logs, and one alert threshold that actually pages you. It is the setup counterpart to deciding what to observe, built so the first sign of a 3am outage is an alert, not an angry email.
When to use
- ✓ Your app is live and you have no idea when it breaks
- ✓ Setting up the operational baseline before real users arrive
- ✓ You got burned by an outage you heard about from a customer
When not to use
- ✗ Pre-launch with no users: wire it at launch, not before
- ✗ You need a full observability platform (distributed tracing, SLOs); that's a bigger design job
Install
Download the .zip, then unzip into your Claude skills folder.
mkdir -p ~/.claude/skills
unzip ~/Downloads/production-monitoring-setup.zip -d ~/.claude/skills/
# Restart your Claude Code session.
# The skill is now available; Claude will use it when relevant.
SKILL.md
---
name: production-monitoring-setup
description: Use when standing up monitoring for a live app — error tracking, uptime, logs, alerts. Triggers on "set up monitoring", "Sentry", "uptime check", "error tracking", "alerting", "how do I know when my app breaks", "production monitoring".
---
# Production Monitoring Setup
The goal is narrow: find out something broke before a user does. For a small app that's four things, wired once. Don't build a NASA control room — build the smoke detector.
## 1. Error tracking
Install an error tracker (Sentry is the default; most hosts have a one-line integration) on both the server and the client. It captures the stack trace, the request, and the user context when something throws — so you debug from real data instead of "it doesn't work." Set the release/version so you can tell which deploy introduced an error.
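
As a rough illustration for a Python server, the sketch below gathers the tracker settings in one place and passes them to `sentry_sdk.init`. It assumes the `sentry-sdk` package is installed. The environment variable names `SENTRY_DSN`, `APP_RELEASE`, and `APP_ENV` and both helper functions are hypothetical, so adapt them to however your host exposes deploy metadata.

```python
import os


def build_sentry_options(env=None):
    """Collect error-tracker settings from the environment.

    SENTRY_DSN, APP_RELEASE, and APP_ENV are assumed variable names,
    not something Sentry or any host defines for you.
    """
    env = os.environ if env is None else env
    return {
        "dsn": env.get("SENTRY_DSN"),
        # Tag every event with the deploy so you can tell which release
        # introduced an error.
        "release": env.get("APP_RELEASE", "unknown"),
        "environment": env.get("APP_ENV", "production"),
    }


def init_error_tracking(env=None):
    """Initialise Sentry; intended to be called once at process start-up."""
    import sentry_sdk  # imported lazily so this module loads without the package

    sentry_sdk.init(**build_sentry_options(env))
```

Call `init_error_tracking()` before the app starts handling requests. The browser client needs its own SDK setup; this sketch covers only the server side.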
## 2. Uptime check
One external check that hits your critical path (the real user flow, not just `/`) on an interval, using a service such as UptimeRobot or Better Stack, or the host's built-in check. An internal health check runs on the same box as the app, so it goes silent when the whole box is down; an external one keeps watching.
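
A hosted service normally runs the check for you. The sketch below only illustrates what a useful check verifies: the right status, a piece of text that proves the real flow rendered, and a sane response time. The function names, the `expected_text` marker, and the five-second limit are assumptions for illustration. Running this on the same machine as the app would defeat the purpose.

```python
import time
import urllib.error
import urllib.request


def evaluate_response(status, body, elapsed_s, expected_text, max_seconds=5.0):
    """Decide whether one response counts as 'up'. Returns (ok, reason)."""
    if status != 200:
        return False, f"unexpected status {status}"
    if expected_text not in body:
        # A 200 with the wrong page (error shell, login wall) is still down.
        return False, "expected text missing"
    if elapsed_s > max_seconds:
        return False, f"too slow: {elapsed_s:.1f}s"
    return True, "ok"


def check_critical_path(url, expected_text, timeout_s=5.0):
    """Fetch the critical-path URL from outside the app and evaluate it."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        return False, f"unexpected status {exc.code}"
    except OSError as exc:  # covers DNS failures, refused connections, timeouts
        return False, f"unreachable: {exc}"
    return evaluate_response(status, body, time.monotonic() - start,
                             expected_text, timeout_s)
```

Checking for `expected_text` is what separates a critical-path check from a homepage ping: pick a string that only appears when the flow you care about actually worked.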
## 3. Log access
Know where your logs are and how to read them *before* the incident: Vercel/Railway dashboard logs, or shipped to a log service. Logs should be structured enough to filter by request, route, and severity. "I can't find the error" at 3am is a setup failure.
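
One way to get logs you can filter by request, route, and severity is to emit one JSON object per line. This is a minimal sketch using only the Python standard library; the field names `request_id` and `route` and the helper `build_logger` are illustrative choices, not a required schema.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "severity": record.levelname,
            "message": record.getMessage(),
            # Filled in from the `extra` dict passed at the call site, if any.
            "request_id": getattr(record, "request_id", None),
            "route": getattr(record, "route", None),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name="app", stream=None):
    """Create a logger that writes JSON lines to `stream` (stderr by default)."""
    handler = logging.StreamHandler(stream or sys.stderr)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

A call such as `logger.error("payment failed", extra={"request_id": "req-1", "route": "/checkout"})` then produces a line you can filter on in the host's log viewer or a log service.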
## 4. One alert that pages you
Wire a single, high-signal alert to where you'll actually see it (Slack, email, phone): "error rate spiked" or "uptime check failed." **One good alert beats ten ignored ones** — alert fatigue is the real failure mode. Tune the threshold so it only fires on something you'd get up for.
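
Most error trackers and uptime services let you configure this alert in their UI, so you rarely need to write it yourself. The sketch below just makes the threshold logic concrete. The 5% error rate, the 50-request minimum, and the function names are assumptions to tune, and `notify_slack` assumes you have created a Slack incoming webhook URL.

```python
import json
import urllib.request


def should_page(error_count, request_count, max_error_rate=0.05, min_requests=50):
    """Return True when the error rate in the current window is worth waking up for."""
    if request_count < min_requests:
        # Too little traffic to trust the ratio; one failed request out of
        # three should not page anyone.
        return False
    return error_count / request_count > max_error_rate


def notify_slack(webhook_url, text):
    """Post a plain-text message to a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status == 200
```

The minimum-traffic guard means a full outage during a quiet period will not trip this rule, which is one reason the uptime check stays in place alongside it.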
## Sequence
Error tracking first (highest signal per minute of setup), then uptime, then confirm log access, then the one alert. Trigger a test error and a test downtime to prove each path actually reaches you — an untested alert is not an alert.
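
For the test error, it helps to raise something unmistakably deliberate and easy to search for. A minimal sketch, with a hypothetical exception class and helper names:

```python
import uuid


class MonitoringDrillError(RuntimeError):
    """Deliberate error raised only to exercise the alerting path."""


def new_drill_id():
    """Generate a short unique marker to search for in the tracker and logs."""
    return f"drill-{uuid.uuid4().hex[:8]}"


def raise_drill_error(drill_id):
    """Raise the drill error; call this from a protected route or admin command."""
    raise MonitoringDrillError(f"monitoring drill {drill_id}: safe to ignore")
```

Trigger it in production once, then confirm the same drill ID shows up in the error tracker, in the logs, and in the channel the alert is wired to. For the downtime test, point the uptime check at a path that fails on purpose and wait for the notification.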
## Anti-patterns
- Uptime check that only pings the homepage while the real flow is broken
- Ten alerts nobody reads (fatigue) instead of one that matters
- Finding out where the logs are *during* the incident
- Never test-firing an alert, then discovering it was misconfigured when it mattered
- Building elaborate dashboards before the basic "are we down" signal exists
Example prompts
Once installed, try these prompts in Claude:
- Set up Sentry + an uptime check + a Slack alert for my Next.js app on Vercel.
- My app went down and I found out from a user. Give me the minimum monitoring so that never happens silently again.
Related prompts
Don't want to install a skill? These prompts in /prompts cover similar ground for one-shot use: