Claude Skill

Browser Automation Driver

Drives a real browser from the CLI — navigate, fill forms, click, screenshot, scrape, and run end-to-end checks on a live web app.


What it does

Wraps a real Chrome instance, driven by an agent-driven CLI over the Chrome DevTools Protocol (CDP), so Claude can open pages, interact with forms and buttons, capture screenshots, extract JS-rendered data, and exercise a running web app end-to-end. Built for verifying that a change actually works in the browser, for QA and dogfooding, and for scraping pages that a plain HTTP fetch can't see.

When to use

✓ Verifying a UI change works in a real browser, not just in unit tests
✓ Scraping or extracting data that's injected by JavaScript (invisible to curl)
✓ Automating a flow: log in, fill a form, submit, screenshot the result
✓ Exploratory QA / bug hunts on a running app

When not to use

✗ A plain static HTTP request would get the data; use that instead, it's cheaper
✗ Pure application logic with no browser surface involved
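
One way to decide between the two tools is to check whether the value you need is already present in the raw HTML the server returns. The sketch below is a minimal illustration, not part of the skill: `static_html_contains` is a hypothetical helper name, and it assumes you already have the response body as a string and know one piece of text that should appear on the rendered page.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text inside <script>/<style> is code, not rendered content.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def static_html_contains(html: str, expected_text: str) -> bool:
    """Hypothetical helper: True if expected_text is in the static markup."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return expected_text in " ".join(collector.chunks)
```

If this returns True for the body of a plain GET, a static request is probably enough. If it returns False while the text is visible in a real browser, the content is likely injected by JavaScript and the browser is the right tool.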

Install

Download the .zip, then unzip into your Claude skills folder.

mkdir -p ~/.claude/skills
unzip ~/Downloads/browser-automation-driver.zip -d ~/.claude/skills/

# Restart your Claude Code session.
# The skill is then available; Claude will use it when relevant.

SKILL.md

---
name: browser-automation-driver
description: Use when a task needs a real browser — navigating pages, filling forms, clicking, screenshots, scraping JS-rendered content, or end-to-end testing a running web app. Triggers on "open this site", "fill the form", "screenshot", "scrape the rendered page", "test this flow".
---

# Browser Automation Driver

Use a real browser when the page only exists after JavaScript runs, when you need to prove a flow works by doing it, or when you're scraping content a static fetch can't see. Reach for a plain HTTP request first when that would actually return the data — the browser is the heavier tool.

## Setup

A CDP-driven browser CLI (e.g. `agent-browser`) gives you accessibility-tree snapshots and compact element references, which usually survive a redesign better than CSS selectors tied to markup and class names. Install it globally and let it manage Chrome; don't hand-roll Playwright unless the project already depends on it.

## Workflow

1. **Snapshot before you act.** Pull the accessibility tree first so you're targeting elements that actually exist, by stable ref — not by guessed selector.
2. **One action, then re-observe.** Navigate, then snapshot. Fill, then snapshot. Don't fire a blind sequence and hope; the DOM changes under you.
3. **Wait on state, not on time.** Wait for an element or a network-idle signal, not a fixed sleep; a fixed delay is either too short on a slow run or wasted time on a fast one.
4. **Capture evidence.** Screenshot at the moments that matter (after submit, on error) so the result is verifiable, not asserted.
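
Step 3 can be sketched as a small polling helper. This is an illustration under assumptions, not the API of any browser CLI: `wait_until` and `probe` are hypothetical names, and `probe` stands in for whatever check you run against a fresh snapshot (for example, "is the confirmation heading present?").

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    probe: Callable[[], Optional[T]],
    timeout: float = 10.0,
    interval: float = 0.25,
) -> T:
    """Call probe() until it returns a truthy value or the timeout expires.

    Returns the truthy value, so the caller can use what was found.
    Raises TimeoutError if the condition never becomes true.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # Short pause between checks; the total wait is bounded by timeout.
        time.sleep(interval)
```

Unlike a fixed sleep, this returns as soon as the condition holds and fails loudly when it never does, which gives you a clear point at which to capture an error screenshot.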

## Good targets

- End-to-end checks: does signup → confirm → dashboard actually complete?
- JS-rendered scraping: pull data the server doesn't put in the initial HTML
- Authenticated flows: log in once, persist the session, then operate
- Exploratory QA: click through like a skeptical user and report what breaks

## Anti-patterns

- Targeting brittle CSS/XPath selectors instead of accessibility refs
- Fixed `sleep` calls instead of waiting on a condition
- Firing a long action chain with no snapshot between steps
- Using the browser for something a single HTTP GET would have answered
- Claiming a flow "works" without a screenshot or extracted value to prove it
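
The third and fifth anti-patterns can be avoided with a runner that re-observes after every action and keeps the evidence. The sketch below is hypothetical throughout: the `Browser` protocol, `StepRecord`, and `run_flow` are illustrative names, not the interface of `agent-browser` or any other tool, and it assumes step names are safe to use in file names. It screenshots every step for simplicity; in practice you might capture only the moments that matter.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol, Sequence, Tuple


class Browser(Protocol):
    """Hypothetical driver interface, for illustration only."""

    def snapshot(self) -> str: ...

    def screenshot(self, path: str) -> None: ...


@dataclass
class StepRecord:
    name: str
    snapshot: str          # what the page looked like after the action
    screenshot_path: str   # where the visual evidence was written


Step = Tuple[str, Callable[[Browser], None]]


def run_flow(
    browser: Browser,
    steps: Sequence[Step],
    log_dir: str = "evidence",
) -> List[StepRecord]:
    """Run one action at a time, re-observing and recording after each."""
    records: List[StepRecord] = []
    for index, (name, action) in enumerate(steps, start=1):
        action(browser)                     # exactly one action
        snap = browser.snapshot()           # re-observe before the next one
        path = f"{log_dir}/{index:02d}-{name}.png"
        browser.screenshot(path)            # evidence for this step
        records.append(StepRecord(name, snap, path))
    return records
```

The returned records give you something concrete to cite when reporting that a flow works: a snapshot and a screenshot path per step, in order.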

Example prompts

Once installed, try these prompts in Claude:

Open our staging signup flow, fill the form with a test user, submit, and screenshot the result.
Scrape the pricing table from this page — the numbers are injected by JS so curl returns an empty table.
