Customer Success pack
Claude Skill
Health Score Builder
Builds a customer health score formula from real signals — usage, support, NPS, engagement. Calibrates, doesn't just average.
What it does
Given your customer base context (segment, product surface area, what "good" usage looks like), produces a health score formula with weighted signals, defined thresholds, and a calibration check against known-good and known-bad customers. Avoids the "average everything to a 0-100" trap that surfaces no real signal.
When to use
- ✓ You're building a health score in Gainsight, Catalyst, or ChurnZero from scratch
- ✓ Your existing health score doesn't correlate with renewals — time to rebuild it
- ✓ You're a CS leader establishing what to measure for the team
When not to use
- ✗ You don't have access to usage / support / NPS data — health scores require data
- ✗ You have <30 customers — scores are noisy at that size; manage by hand
- ✗ You want a score because your boss wants a dashboard. Build the practice first; the score will reflect it.
Install
Download the .zip, then unzip into your Claude skills folder.
mkdir -p ~/.claude/skills
unzip ~/Downloads/health-score-builder.zip -d ~/.claude/skills/
# Restart Claude Code session.
# Skill is now available — Claude will use it when relevant.
SKILL.md
---
name: health-score-builder
description: Use when designing or rebuilding a customer health score in Gainsight, Catalyst, ChurnZero, or a homegrown system. Triggers on "health score", "health score formula", "redesign our health score".
---
# Health Score Builder
Most health scores are useless because they average everything into a number that doesn't drive action. A good health score is calibrated against actual outcomes (did this customer renew? did they expand?) and lets a CSM know which signal to act on, not just "you're red."
## Required inputs
1. **Customer segment** — SMB / mid-market / enterprise (different signals matter)
2. **Available data sources** — product usage (which events?), support (Zendesk / Intercom volume + CSAT), NPS / surveys, billing, exec engagement tracking
3. **5-10 known-good customers** (renewed and expanded) and **5-10 known-bad** (churned or contracted)
4. **Time horizon** — score for 30-day risk? 90-day? renewal?
5. **Existing formula if any** — what's broken about it
If the user has no known-good / known-bad list, push back. You cannot calibrate without ground truth.
## Design framework
### Step 1: Identify the 4-6 signals that actually matter
Not 20. Not 3. Around 5. For each, define the following (a sketch for writing these down follows the candidate lists below):
- **What it measures** (concrete event or metric)
- **Why it predicts** renewal or churn (your hypothesis)
- **Threshold for green / yellow / red**
Common high-signal candidates by segment:
**Mid-market / Enterprise**:
- Core feature adoption (specific features, not aggregate logins)
- Exec sponsor engagement (last touchpoint with VP+ stakeholder, in days)
- Multi-threading (number of distinct active users from customer)
- Support sentiment (P1 frequency + CSAT)
- Time-to-value milestone hit (did they reach the activation moment?)
**SMB**:
- Login recency (more useful here than at enterprise)
- Single-threaded risk (one user = high risk)
- Support volume normalized by ARR
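To keep Step 1 concrete, it can help to write each signal down as a small structured record before any scoring logic exists. The sketch below is illustrative only; the class name, fields, and example signal are assumptions, and the thresholds are placeholders to be set in Step 3:

```python
from dataclasses import dataclass

@dataclass
class SignalDefinition:
    name: str         # what it measures (a concrete event or metric)
    hypothesis: str   # why it should predict renewal or churn
    green: str        # threshold descriptions, refined in Step 3
    yellow: str
    red: str

# Hypothetical mid-market example; thresholds are placeholders until calibrated.
exec_sponsor_engagement = SignalDefinition(
    name="Days since last touchpoint with a VP+ stakeholder",
    hypothesis="A stale exec sponsor precedes surprise churn at renewal",
    green="60 days or fewer",
    yellow="61-120 days",
    red="more than 120 days, or sponsor changed",
)
```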
### Step 2: Weight the signals
NOT equal weights. Typically:
- Exec sponsor engagement: 25-30%
- Core feature adoption: 25-30%
- Multi-threading: 15-20%
- Support / sentiment: 10-15%
- Other (NPS, billing, etc.): 10-15%
Calibrate weights against your known-good vs known-bad list. If a signal has the same value for both, it's useless — drop or replace.
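As a rough illustration, the weights can live in a small config that gets checked rather than assumed. The numbers below are just midpoints of the ranges above and are starting assumptions to calibrate, not recommendations:

```python
# Illustrative starting weights (midpoints of the ranges above); calibrate before use.
WEIGHTS = {
    "exec_sponsor_engagement": 0.28,
    "core_feature_adoption": 0.28,
    "multi_threading": 0.18,
    "support_sentiment": 0.13,
    "other": 0.13,
}

assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9, "weights should sum to 1"
```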
### Step 3: Define thresholds (per signal)
Don't use a 0-100 score. Use **green / yellow / red** per signal. Examples (a code sketch of these thresholds follows the list):
**Exec sponsor engagement**:
- Green: touchpoint with VP+ in last 60 days
- Yellow: 60-120 days
- Red: 120+ days, or sponsor changed
**Core feature adoption**:
- Green: weekly active in 3+ core features
- Yellow: weekly active in 1-2
- Red: usage in any core feature falls below 50% of the prior month, or zero use of a core feature within 90 days of contract start
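A minimal sketch of the two example signals as threshold functions, assuming you can pull days-since-last-VP+-touchpoint and per-feature activity from your systems. Function names, inputs, and the reading of the red rules are assumptions to adapt:

```python
def exec_engagement_status(days_since_vp_touchpoint: int, sponsor_changed: bool) -> str:
    """Green / yellow / red for exec sponsor engagement, per the thresholds above."""
    if sponsor_changed or days_since_vp_touchpoint > 120:
        return "red"
    if days_since_vp_touchpoint > 60:
        return "yellow"
    return "green"


def feature_adoption_status(weekly_active_core_features: int,
                            worst_mom_usage_ratio: float,
                            days_since_contract_start: int,
                            any_core_feature_unused: bool) -> str:
    """Green / yellow / red for core feature adoption, per the thresholds above."""
    # Red: some core feature fell below 50% of last month's usage, or a core
    # feature is still untouched 90+ days after contract start.
    if worst_mom_usage_ratio < 0.5 or (any_core_feature_unused and days_since_contract_start >= 90):
        return "red"
    if weekly_active_core_features >= 3:
        return "green"
    if weekly_active_core_features >= 1:
        return "yellow"
    return "red"  # zero weekly-active core features reads as red here
```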
### Step 4: Aggregate signal — but preserve detail
Top-line score (Green / Yellow / Red overall) PLUS the per-signal status. The CSM needs to know not just "yellow" but "yellow because exec sponsor stale and core feature usage decaying — multi-threading and support are fine."
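One way to aggregate while preserving detail, sketched under two stated assumptions: red / yellow / green map to 0 / 0.5 / 1, and 0.75 and 0.5 are the overall cutoffs. Both are placeholders to tune during calibration:

```python
STATUS_VALUE = {"red": 0.0, "yellow": 0.5, "green": 1.0}  # assumed numeric mapping

def overall_health(signal_statuses: dict[str, str], weights: dict[str, float]) -> dict:
    """Return the top-line status plus every per-signal status, so the CSM sees both."""
    weighted = sum(weights[s] * STATUS_VALUE[status] for s, status in signal_statuses.items())
    if weighted >= 0.75:        # cutoffs are assumptions; tune them in Step 5
        top = "green"
    elif weighted >= 0.5:
        top = "yellow"
    else:
        top = "red"
    return {"overall": top, "weighted_score": round(weighted, 2), "signals": signal_statuses}

# Example: exec sponsor stale and core feature usage decaying, everything else fine.
example = overall_health(
    {"exec_sponsor_engagement": "red", "core_feature_adoption": "yellow",
     "multi_threading": "green", "support_sentiment": "green", "other": "green"},
    {"exec_sponsor_engagement": 0.28, "core_feature_adoption": 0.28,
     "multi_threading": 0.18, "support_sentiment": 0.13, "other": 0.13},
)
# -> overall "yellow", with the per-signal detail preserved in example["signals"]
```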
### Step 5: Calibrate
Run the formula against known-good and known-bad customers from 6-12 months ago.
- Did churned customers score Red 60+ days before churn?
- Did renewed customers score Green or Yellow consistently?
- If signals don't separate the two groups, redesign.
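A back-of-the-envelope way to run that check, assuming you can reconstruct each customer's status at a few points before the outcome. The data shape and field names here are hypothetical:

```python
def calibration_report(histories: list[dict]) -> dict:
    """
    histories: one entry per known-good / known-bad customer, e.g.
      {"outcome": "churned", "status_by_days_before_outcome": {90: "yellow", 60: "red", 30: "red"}}
    Checks whether churned accounts went red with 60+ days of warning and
    whether renewed / expanded accounts avoided red entirely.
    """
    churned = [h for h in histories if h["outcome"] == "churned"]
    kept = [h for h in histories if h["outcome"] in ("renewed", "expanded")]

    churn_caught = sum(
        1 for h in churned
        if any(days >= 60 and status == "red"
               for days, status in h["status_by_days_before_outcome"].items())
    )
    kept_never_red = sum(
        1 for h in kept
        if all(status != "red" for status in h["status_by_days_before_outcome"].values())
    )
    return {
        "churned flagged red 60+ days early": f"{churn_caught}/{len(churned)}",
        "renewed or expanded never red": f"{kept_never_red}/{len(kept)}",
    }
```

If both counts are low, the signals are not separating the two groups and the formula needs to go back to Step 1.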
## Output
### The formula
4-6 signals, weights, thresholds, aggregate logic.
### Calibration table
| Customer | Outcome (renewed / churned / expanded) | Score 90 days before outcome | Did formula predict? |
|---|---|---|---|
### Action playbook tied to score
- Red overall: CSM intervention within 7 days, escalate to VP CS
- Red on exec engagement: AE + CSM joint outreach within 14 days
- Red on usage: trigger a usage audit + champion conversation
- Yellow: monitor, no auto-trigger
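If the playbook will be automated in a CS tool, the mapping can be kept as data rather than buried in per-account notes. The structure below is an assumption; the triggers and SLAs mirror the list above, and the health dict is the shape sketched in Step 4:

```python
# Evaluated top to bottom; the first matching rule wins. SLAs are in days.
PLAYBOOK = [
    {"when": {"overall": "red"},                 "action": "CSM intervention; escalate to VP CS", "sla_days": 7},
    {"when": {"exec_sponsor_engagement": "red"}, "action": "Joint AE + CSM outreach",             "sla_days": 14},
    {"when": {"core_feature_adoption": "red"},   "action": "Usage audit + champion conversation", "sla_days": None},
    {"when": {"overall": "yellow"},              "action": "Monitor; no auto-trigger",            "sla_days": None},
]

def next_action(health: dict) -> dict | None:
    """Return the first playbook rule whose conditions match the health result."""
    statuses = {"overall": health["overall"], **health["signals"]}
    for rule in PLAYBOOK:
        if all(statuses.get(signal) == status for signal, status in rule["when"].items()):
            return rule
    return None
```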
### Known limits
What this score WON'T catch:
- Acquisitions / external events
- A champion change while the account still reads green
- Competitive replacement before the signals show it
Document these. The score should drive action; humans handle the rest.
## Anti-patterns
- Averaging 12 signals into one number that obscures which is failing
- Scoring everything 0-100 with no calibration against actual renewal data
- Using NPS as a primary signal — it correlates poorly with B2B renewal
- "Logins" as a feature signal — logins ≠ value
- Scoring without exec sponsor data when you have it
## Tone
Mechanical and skeptical. A health score is a model; treat it like one. If a signal doesn't separate winners from losers, kill it.
Example prompts
Once installed, try these prompts in Claude:
- Build a health score for our $50k-$500k mid-market segment. We have product usage data (logins, 3 core feature usage), Zendesk tickets, NPS quarterly, and exec sponsor mapping. Help me design the formula.
- Our current health score is just "logins last 30 days" averaged with NPS. It doesn't predict churn. Help me redesign it.