Customer Success pack
Claude Skill
Health Score Builder
Builds a customer health score formula from real signals — usage, support, NPS, engagement. Calibrates, doesn't just average.
What it does
Given your customer base context (segment, product surface area, what "good" usage looks like), produces a health score formula with weighted signals, defined thresholds, and a calibration check against known-good and known-bad customers. Avoids the "average everything to a 0-100" trap that surfaces no real signal.
When to use
- ✓ You're building a health score in Gainsight, Catalyst, or ChurnZero from scratch
- ✓ Your existing health score doesn't correlate with renewals — time to rebuild it
- ✓ You're a CS leader establishing what to measure for the team
When not to use
- ✗ You don't have access to usage / support / NPS data — health scores require data
- ✗ You have <30 customers — scores are noisy at that size; manage by hand
- ✗ You want a score because your boss wants a dashboard. Build the practice first; the score will reflect it.
Install
Download the .zip, then unzip into your Claude skills folder.
mkdir -p ~/.claude/skills
unzip ~/Downloads/health-score-builder.zip -d ~/.claude/skills/
# Restart Claude Code session.
# Skill is now available — Claude will use it when relevant.
SKILL.md
---
name: health-score-builder
description: Use when designing or rebuilding a customer health score in Gainsight, Catalyst, ChurnZero, or a homegrown system. Triggers on "health score", "health score formula", "redesign our health score".
---
# Health Score Builder
Most health scores are useless because they average everything into a number that doesn't drive action. A good health score is calibrated against actual outcomes (did this customer renew? did they expand?) and lets a CSM know which signal to act on, not just "you're red."
## Required inputs
1. **Customer segment** — SMB / mid-market / enterprise (different signals matter)
2. **Available data sources** — product usage (which events?), support (Zendesk / Intercom volume + CSAT), NPS / surveys, billing, exec engagement tracking
3. **5-10 known-good customers** (renewed and expanded) and **5-10 known-bad** (churned or contracted)
4. **Time horizon** — score for 30-day risk? 90-day? renewal?
5. **Existing formula if any** — what's broken about it
If the user has no known-good / known-bad list, push back. You cannot calibrate without ground truth.
## Design framework
### Step 1: Identify the 4-6 signals that actually matter
Not 20. Not 3. Around 5. For each, define the following (a sketch for writing these down follows the candidate lists below):
- **What it measures** (concrete event or metric)
- **Why it predicts** renewal or churn (your hypothesis)
- **Threshold for green / yellow / red**
Common high-signal candidates by segment:
**Mid-market / Enterprise**:
- Core feature adoption (specific features, not aggregate logins)
- Exec sponsor engagement (last touchpoint with VP+ stakeholder, in days)
- Multi-threading (number of distinct active users from customer)
- Support sentiment (P1 frequency + CSAT)
- Time-to-value milestone hit (did they reach the activation moment?)
**SMB**:
- Login recency (more useful here than at enterprise)
- Single-threaded risk (one user = high risk)
- Support volume normalized by ARR
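To keep Step 1 concrete, it can help to write each signal down as a small structured record before any scoring logic exists. The sketch below is illustrative only; the class name, fields, and example signal are assumptions, and the thresholds are placeholders to be set in Step 3:

```python
from dataclasses import dataclass

@dataclass
class SignalDefinition:
    name: str         # what it measures (a concrete event or metric)
    hypothesis: str   # why it should predict renewal or churn
    green: str        # threshold descriptions, refined in Step 3
    yellow: str
    red: str

# Hypothetical mid-market example; thresholds are placeholders until calibrated.
exec_sponsor_engagement = SignalDefinition(
    name="Days since last touchpoint with a VP+ stakeholder",
    hypothesis="A stale exec sponsor precedes surprise churn at renewal",
    green="60 days or fewer",
    yellow="61-120 days",
    red="more than 120 days, or sponsor changed",
)
```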
### Step 2: Weight the signals
NOT equal weights. Typically:
- Exec sponsor engagement: 25-30%
- Core feature adoption: 25-30%
- Multi-threading: 15-20%
- Support / sentiment: 10-15%
- Other (NPS, billing, etc.): 10-15%
Calibrate weights against your known-good vs known-bad list. If a signal has the same value for both, it's useless — drop or replace.
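As a rough illustration, the weights can live in a small config that gets checked rather than assumed. The numbers below are just midpoints of the ranges above and are starting assumptions to calibrate, not recommendations:

```python
# Illustrative starting weights (midpoints of the ranges above); calibrate before use.
WEIGHTS = {
    "exec_sponsor_engagement": 0.28,
    "core_feature_adoption": 0.28,
    "multi_threading": 0.18,
    "support_sentiment": 0.13,
    "other": 0.13,
}

assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9, "weights should sum to 1"
```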
### Step 3: Define thresholds (per signal)
Don't use a 0-100 score. Use **green / yellow / red** per signal. Examples (a code sketch of these thresholds follows the list):
**Exec sponsor engagement**:
- Green: touchpoint with VP+ in last 60 days
- Yellow: 60-120 days
- Red: 120+ days, or sponsor changed
**Core feature adoption**:
- Green: weekly active in 3+ core features
- Yellow: weekly active in 1-2
- Red: usage in any core feature falls below 50% of the prior month, or zero use of a core feature within 90 days of contract start
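A minimal sketch of the two example signals as threshold functions, assuming you can pull days-since-last-VP+-touchpoint and per-feature activity from your systems. Function names, inputs, and the reading of the red rules are assumptions to adapt:

```python
def exec_engagement_status(days_since_vp_touchpoint: int, sponsor_changed: bool) -> str:
    """Green / yellow / red for exec sponsor engagement, per the thresholds above."""
    if sponsor_changed or days_since_vp_touchpoint > 120:
        return "red"
    if days_since_vp_touchpoint > 60:
        return "yellow"
    return "green"


def feature_adoption_status(weekly_active_core_features: int,
                            worst_mom_usage_ratio: float,
                            days_since_contract_start: int,
                            any_core_feature_unused: bool) -> str:
    """Green / yellow / red for core feature adoption, per the thresholds above."""
    # Red: some core feature fell below 50% of last month's usage, or a core
    # feature is still untouched 90+ days after contract start.
    if worst_mom_usage_ratio < 0.5 or (any_core_feature_unused and days_since_contract_start >= 90):
        return "red"
    if weekly_active_core_features >= 3:
        return "green"
    if weekly_active_core_features >= 1:
        return "yellow"
    return "red"  # zero weekly-active core features reads as red here
```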
### Step 4: Aggregate signal — but preserve detail
Top-line score (Green / Yellow / Red overall) PLUS the per-signal status. The CSM needs to know not just "yellow" but "yellow because exec sponsor stale and core feature usage decaying — multi-threading and support are fine."
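One way to aggregate while preserving detail, sketched under two stated assumptions: red / yellow / green map to 0 / 0.5 / 1, and 0.75 and 0.5 are the overall cutoffs. Both are placeholders to tune during calibration:

```python
STATUS_VALUE = {"red": 0.0, "yellow": 0.5, "green": 1.0}  # assumed numeric mapping

def overall_health(signal_statuses: dict[str, str], weights: dict[str, float]) -> dict:
    """Return the top-line status plus every per-signal status, so the CSM sees both."""
    weighted = sum(weights[s] * STATUS_VALUE[status] for s, status in signal_statuses.items())
    if weighted >= 0.75:        # cutoffs are assumptions; tune them in Step 5
        top = "green"
    elif weighted >= 0.5:
        top = "yellow"
    else:
        top = "red"
    return {"overall": top, "weighted_score": round(weighted, 2), "signals": signal_statuses}

# Example: exec sponsor stale and core feature usage decaying, everything else fine.
example = overall_health(
    {"exec_sponsor_engagement": "red", "core_feature_adoption": "yellow",
     "multi_threading": "green", "support_sentiment": "green", "other": "green"},
    {"exec_sponsor_engagement": 0.28, "core_feature_adoption": 0.28,
     "multi_threading": 0.18, "support_sentiment": 0.13, "other": 0.13},
)
# -> overall "yellow", with the per-signal detail preserved in example["signals"]
```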
### Step 5: Calibrate
Run the formula against known-good and known-bad customers from 6-12 months ago.
- Did churned customers score Red 60+ days before churn?
- Did renewed customers score Green or Yellow consistently?
- If signals don't separate the two groups, redesign.
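A back-of-the-envelope way to run that check, assuming you can reconstruct each customer's status at a few points before the outcome. The data shape and field names here are hypothetical:

```python
def calibration_report(histories: list[dict]) -> dict:
    """
    histories: one entry per known-good / known-bad customer, e.g.
      {"outcome": "churned", "status_by_days_before_outcome": {90: "yellow", 60: "red", 30: "red"}}
    Checks whether churned accounts went red with 60+ days of warning and
    whether renewed / expanded accounts avoided red entirely.
    """
    churned = [h for h in histories if h["outcome"] == "churned"]
    kept = [h for h in histories if h["outcome"] in ("renewed", "expanded")]

    churn_caught = sum(
        1 for h in churned
        if any(days >= 60 and status == "red"
               for days, status in h["status_by_days_before_outcome"].items())
    )
    kept_never_red = sum(
        1 for h in kept
        if all(status != "red" for status in h["status_by_days_before_outcome"].values())
    )
    return {
        "churned flagged red 60+ days early": f"{churn_caught}/{len(churned)}",
        "renewed or expanded never red": f"{kept_never_red}/{len(kept)}",
    }
```

If both counts are low, the signals are not separating the two groups and the formula needs to go back to Step 1.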
## Output
### The formula
4-6 signals, weights, thresholds, aggregate logic.
### Calibration table
| Customer | Outcome (renewed / churned / expanded) | Score 90 days before outcome | Did formula predict? |
|---|---|---|---|
### Action playbook tied to score
- Red overall: CSM intervention within 7 days, escalate to VP CS
- Red on exec engagement: AE + CSM joint outreach within 14 days
- Red on usage: trigger a usage audit + champion conversation
- Yellow: monitor, no auto-trigger
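If the playbook will be automated in a CS tool, the mapping can be kept as data rather than buried in per-account notes. The structure below is an assumption; the triggers and SLAs mirror the list above, and the health dict is the shape sketched in Step 4:

```python
# Evaluated top to bottom; the first matching rule wins. SLAs are in days.
PLAYBOOK = [
    {"when": {"overall": "red"},                 "action": "CSM intervention; escalate to VP CS", "sla_days": 7},
    {"when": {"exec_sponsor_engagement": "red"}, "action": "Joint AE + CSM outreach",             "sla_days": 14},
    {"when": {"core_feature_adoption": "red"},   "action": "Usage audit + champion conversation", "sla_days": None},
    {"when": {"overall": "yellow"},              "action": "Monitor; no auto-trigger",            "sla_days": None},
]

def next_action(health: dict) -> dict | None:
    """Return the first playbook rule whose conditions match the health result."""
    statuses = {"overall": health["overall"], **health["signals"]}
    for rule in PLAYBOOK:
        if all(statuses.get(signal) == status for signal, status in rule["when"].items()):
            return rule
    return None
```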
### Known limits
What this score WON'T catch:
- Acquisitions / external events
- A champion change while the account still reads green
- Competitive replacement before the signals show it
Document these. The score should drive action; humans handle the rest.
## Anti-patterns
- Averaging 12 signals into one number that obscures which is failing
- Scoring everything 0-100 with no calibration against actual renewal data
- Using NPS as a primary signal — it correlates poorly with B2B renewal
- "Logins" as a feature signal — logins ≠ value
- Scoring without exec sponsor data when you have it
## Tone
Mechanical and skeptical. A health score is a model; treat it like one. If a signal doesn't separate winners from losers, kill it.
Example prompts
Once installed, try these prompts in Claude:
- Build a health score for our $50k-$500k mid-market segment. We have product usage data (logins, 3 core feature usage), Zendesk tickets, NPS quarterly, and exec sponsor mapping. Help me design the formula.
- Our current health score is just "logins last 30 days" averaged with NPS. It doesn't predict churn. Help me redesign it.