A Claude Code plugin for UX regression testing — evaluates whether people can understand, decide, act, and recover across real user journeys.
Warning
Status: Experimental. UX Audit is still evolving, and interfaces, outputs, checks, and workflows may change without notice.
- Reads your project — code, routes, docs, tests — and auto-generates journey scenarios specific to your product (typically 5–8+)
- Walks each journey on the live app — can someone sign up, reach first value, recover when things go wrong?
- Runs ~40 additional checks across AI-slop, accessibility, usability, core experience, and desirability
- Returns a ranked fix plan with screenshot evidence you can hand straight back to your coding agent
Every check cites a published source (WCAG 2.2, NN/g, Nielsen, Krug, Baymard, ISO 9241) — no vibes. Built for teams shipping fast with Claude Code and coding agents.
Actual run on a sample recipe app — journey scenarios were auto-generated, each walked on the live app:
Try it
# Inside Claude Code — URL is auto-detected from your project config
/uxaudit:uxaudit my-app

Quick catalog-only check (skip journey evaluation):

/uxaudit:uxaudit my-app --skip core-experience

Host support
- Claude Code — first-class. Installed as a plugin, invoked as /uxaudit:uxaudit.
- Codex and other coding-agent hosts — not yet supported. The skills/uxaudit/ bundle is being designed to be portable, and broader host support is on the roadmap.
Maturity: experimental. Catalog checks, capture pipeline, dashboard, and history/compare views are used today; richer decide / recover evaluation and tighter host portability are still being built out. A full audit (catalog + journey evaluation) typically takes 30–60 minutes in Claude Code; Max plan is recommended for regular use.
uxaudit runs ~40 checks grouped into five categories. Every check is backed by a published source — no opinion-only findings.
| Category | What it checks | Example findings |
|---|---|---|
| ai-slop | Fingerprints of AI-generated UI | purple gradients, shadcn defaults everywhere, "coming soon" stubs, emoji-as-icon, generic hero copy |
| accessibility | WCAG 2.2 AA floor | contrast, tap targets, focus rings, keyboard reach, axe-core violations |
| usability | Nielsen heuristics & journey continuity | dead-end flows, missing empty states, weak error recovery, inconsistent feedback |
| core experience | Primary journey completion | can a new user actually sign up, create the first thing, and share it? |
| desirability | Visual craft, microcopy, first impression | typography drift, generic voice, weak aesthetic impression |
Full citation list: skills/uxaudit/references/knowledge/sources.md.
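As a rough mental model of the catalog (the `Finding` shape and field names here are illustrative assumptions, not the plugin's actual data model), each check carries its category and the published source backing it, and category filters behave like `--only` / `--skip`:

```python
from dataclasses import dataclass

# Hypothetical shape of a catalog finding -- illustrative only,
# not the plugin's real internals.
@dataclass
class Finding:
    check_id: str
    category: str  # ai-slop, accessibility, usability, core-experience, desirability
    summary: str
    source: str    # published citation backing the check, e.g. "WCAG 2.2 SC 1.4.3"

def filter_findings(findings, only=None, skip=None):
    """Mimic the --only / --skip category filters over a list of findings."""
    result = findings
    if only:
        result = [f for f in result if f.category in only]
    if skip:
        result = [f for f in result if f.category not in skip]
    return result

findings = [
    Finding("contrast-aa", "accessibility", "Body text below 4.5:1", "WCAG 2.2 SC 1.4.3"),
    Finding("hero-copy", "ai-slop", "Generic hero copy", "NN/g on generic marketing language"),
]
print([f.check_id for f in filter_findings(findings, only={"accessibility"})])
# → ['contrast-aa']
```

The point of the `source` field is the "no opinion-only findings" rule: a finding without a citation simply does not make it into the catalog.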
| | Playwright / Cypress | axe / Lighthouse | uxaudit |
|---|---|---|---|
| Does the user flow execute? | ✅ | — | — |
| WCAG AA violations? | — | ✅ | ✅ (floor) |
| Core Web Vitals / perf? | — | ✅ | — |
| Can a new user reach first value without friction? | — | — | ✅ |
| Dead-end / recovery / decision-cost checks? | — | — | ✅ |
| AI-slop & design-system drift detection? | — | — | ✅ |
| Ranked fix plan with evidence & history? | — | — | ✅ |
E2E tells you the flow runs. uxaudit tells you the flow makes sense while running. It is meant to sit between your test suite and manual review — most teams run it between "E2E passes" and "merge".
See Overview and positioning for the longer explanation.
The dashboard is designed to be actionable:
- Read the run through reader-facing buckets (UX issues / UI risk signals)
- Inspect the evidence behind each finding
- Decide which proposals to push into the next implementation loop
The report also makes it easy to inspect the difference between failures and healthy surfaces, so reviewers do not mistake "we found issues" for "everything is broken."
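One way to picture the reader-facing split (the bucketing rule below is a hypothetical sketch, not the dashboard's actual logic): findings with hard evidence land in "UX issues", while heuristic-only matches stay in "UI risk signals".

```python
# Hypothetical bucketing rule -- the real dashboard may weigh evidence differently.
def bucket(finding):
    # Hard evidence (a failed journey step, a screenshot, an axe-core
    # violation) makes a confirmed UX issue; otherwise it is a risk signal.
    return "ux-issues" if finding.get("evidence") else "ui-risk-signals"

run = [
    {"id": "dead-end-checkout", "evidence": "screenshot-14.png"},
    {"id": "purple-gradient-hero", "evidence": None},
]
buckets = {}
for f in run:
    buckets.setdefault(bucket(f), []).append(f["id"])
print(buckets)
# → {'ux-issues': ['dead-end-checkout'], 'ui-risk-signals': ['purple-gradient-hero']}
```

Keeping the two buckets apart is what lets a reviewer see "we found issues" without reading it as "everything is broken."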
- Supported now: web apps, browser-like runtimes, and Electron
- Also includes: browser-based product surfaces and task-oriented websites
- Out of scope: pure native macOS/iOS/Android apps and CLIs
See Overview and positioning for the longer explanation of where uxaudit sits relative to E2E, accessibility checks, and manual QA.
- Claude Code CLI with plugin marketplace support (Max plan recommended)
- Node.js 20+ and Python 3.10+
- ~300 MB disk for Playwright Chromium on first install
Journey checks use Playwright first and escalate to computer-use only when the flow temporarily crosses outside Playwright's control (file pickers, OS dialogs, external auth). Verdicts are based on visible step-by-step state changes, not DOM inspection.
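The Playwright-first escalation rule can be sketched as a simple predicate (the step names and the `OUT_OF_BROWSER` set are illustrative assumptions, not the plugin's real internals):

```python
# Steps that temporarily leave the page context, where Playwright
# cannot drive the UI. Illustrative set -- not the plugin's actual list.
OUT_OF_BROWSER = {"file-picker", "os-dialog", "external-auth"}

def driver_for(step):
    """Default to Playwright; escalate to computer-use only when the
    step crosses outside Playwright's control."""
    return "computer-use" if step in OUT_OF_BROWSER else "playwright"

journey = ["open-signup", "fill-form", "external-auth", "reach-dashboard"]
print([(step, driver_for(step)) for step in journey])
```

Only the third step would escalate here; the flow drops back to Playwright as soon as control returns to the page.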
Install uxaudit from this repository's Claude Code marketplace:
# 1. Add the marketplace from GitHub
claude plugin marketplace add gotalab/uxaudit
# 2. Install the plugin from that marketplace
claude plugin install uxaudit@gotalab-uxaudit
# 3. Start Claude Code normally
claude

For a shared repository setup:
claude plugin marketplace add gotalab/uxaudit
claude plugin install uxaudit@gotalab-uxaudit --scope project

From inside an active Claude Code session:
/plugin marketplace add gotalab/uxaudit
/plugin install uxaudit@gotalab-uxaudit
/reload-plugins
On first use, the plugin installs Playwright Chromium if needed. Verify it loaded with /plugin. The skill is invoked as /uxaudit:uxaudit.
# Run the audit — URL is auto-detected, dev server is started if needed
/uxaudit:uxaudit my-app
# Or specify the URL explicitly
/uxaudit:uxaudit my-app --url http://localhost:3000

Other modes:
/uxaudit:uxaudit my-app --lang ja # Japanese output
/uxaudit:uxaudit my-app --viewport mobile # responsive (mobile/tablet/desktop-lg/etc.)
/uxaudit:uxaudit my-app --viewports mobile,tablet,desktop # multi-viewport in one run
/uxaudit:uxaudit my-app --electron-app ./apps/desktop/main.js
/uxaudit:uxaudit my-app --only ai-slop,usability # category filter
/uxaudit:uxaudit my-app --skip core-experience # skip journeys

Full viewport presets and argument reference: skills/uxaudit/SKILL.md.
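As a sketch of how a `--viewports` value might resolve to concrete sizes (the preset names mirror the flags above, but the widths here are assumptions; the authoritative table is in skills/uxaudit/SKILL.md):

```python
# Assumed preset widths in CSS pixels -- see SKILL.md for the real values.
PRESETS = {"mobile": 390, "tablet": 768, "desktop": 1280, "desktop-lg": 1536}

def parse_viewports(arg):
    """Turn a --viewports value like 'mobile,tablet,desktop' into widths,
    rejecting unknown preset names early."""
    names = [n.strip() for n in arg.split(",") if n.strip()]
    unknown = [n for n in names if n not in PRESETS]
    if unknown:
        raise ValueError(f"unknown viewport preset(s): {unknown}")
    return {n: PRESETS[n] for n in names}

print(parse_viewports("mobile,tablet,desktop"))
# → {'mobile': 390, 'tablet': 768, 'desktop': 1280}
```

Failing fast on unknown preset names keeps a typo from silently producing a single-viewport run.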
After multiple audit iterations you can view the timeline, compare two workspaces, or see the cross-project library:
# From the skill bundle
python $UXAUDIT_DIR/scripts/generate_dashboard.py <workspace> --timeline
python $UXAUDIT_DIR/scripts/generate_dashboard.py <ws-a> <ws-b> --compare
python $UXAUDIT_DIR/scripts/generate_dashboard.py --library

- Overview and positioning
- Maturity, workflow, and roadmap
- Skill runtime reference
- Check sources and citations
Bug reports and concrete UX-pattern observations (especially "I see this in ≥3 real projects") are welcome via GitHub issues.