chore(skills): add /research and /prototype; rewrite README for 6-skill chain

Front-loads gap discovery before /grill-me by adding two skills: research MOSTLY fans out Explore + WebSearch agents in parallel, synthesizes findings into a brief, names open unknowns explicitly (which become /prototype targets) prototype MOSTLY builds a throwaway spike to test ONE falsifiable assumption; code lives in .prototypes/ (gitignored), never promoted; output is evidence — verdict, numbers, observed behavior — that feeds /prd Full chain now: /research → /prototype → /grill-me → /prd → /prd-to-issues → /ship-it Chain rationale: /research and /prototype surface knowledge gaps and falsify risky assumptions while the cost of changing direction is still cheap; the TOGETHER phases (grill-me, prd) lock down the contract; the AFK phase (ship-it) only executes against contracts already on paper. The chain is a default, not a mandate — README covers when to skip upstream skills for small or stack-familiar work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:33:26 +02:00
parent 6ff262e96e
commit fcaad8cd9f
3 changed files with 265 additions and 93 deletions
--- a/.claude/skills/prototype/SKILL.md
+++ b/.claude/skills/prototype/SKILL.md
@@ -0,0 +1,65 @@
+---
+name: prototype
+description: Build a throwaway spike to falsify or confirm a single risky assumption. Code lives in .prototypes/ (gitignored) and is never promoted to the main codebase. Reports findings as evidence that feeds /prd. Use when the user invokes /prototype, says "spike X", "throwaway test for Y", "can we actually do Z" — typically after /research surfaces an Open Unknown.
+---
+
+# Prototype — throwaway spike
+
+**Mode: MOSTLY TOGETHER.** The build and run go AFK, but the user picks the assumption to test and decides what the findings mean. The output is *evidence*, not production code.
+
+You are now an engineer running a time-boxed spike to learn one thing. The point is to falsify or confirm an assumption fast — not to build a feature, not to produce code anyone will reuse.
+
+## Hard rules
+
+1. **One assumption per prototype.** If the user gives you two, ask which matters most; the other can be a second prototype.
+2. **The assumption must be falsifiable.** "Will it be fast?" → no. "Can Node-RED sustain 1k msg/s to InfluxDB on the dev VM for 10 min?" → yes. If the user's claim isn't falsifiable, refuse and ask for a sharper one before building anything.
+3. **Throwaway means throwaway.** Code lives in `<repo-root>/.prototypes/<short-name>/` only. The directory is gitignored (add it as the first step if it isn't). Nothing in `.prototypes/` is ever committed to the main codebase. No exceptions.
+4. **Time-box.** Default budget: 30 minutes of work and ≤200 LOC. If the user gave a different budget, use that. When you blow through, stop and report whatever you've got.
+
+## Steps
+
+1. **Restate the assumption** in falsifiable form. Show it to the user. Wait one turn for confirmation or correction — this is the only mid-skill checkpoint.
+2. **Pick the minimum viable test.** Options:
+   - **Code spike** — throwaway script that exercises the question. Most common.
+   - **Reading spike** — deep read of a library/spec/codebase, no code. Use when the question is "does X support Y" and the docs would tell you.
+   - **Manual integration spike** — run a command, hit an endpoint, observe. Use when the question is about a real service's behavior.
+3. **Set up the dir.**
+   ```bash
+   ROOT=$(git rev-parse --show-toplevel)
+   mkdir -p "$ROOT/.prototypes/<name>"
+   grep -qxE '\.prototypes/?' "$ROOT/.gitignore" 2>/dev/null || echo '.prototypes/' >> "$ROOT/.gitignore"
+   ```
+4. **Build the smallest thing that tests the assumption.** Resist polish. No tests on the prototype itself, no error handling, no docs, no abstractions. Hardcode values. Inline everything.
+5. **Run it.** Capture output. If it crashes in a way that's *about* the assumption (e.g. memory blows up at 1k msg/s), that's a finding — not a bug to fix.
+6. **Iterate up to the budget.** If a quick adjustment sharpens the test, make it. If you're tempted to refactor or expand scope, stop and report instead.
+7. **Report findings.** In chat, using this structure:
+
+```
+# Prototype findings: <assumption>
+
+**Verdict:** confirmed | falsified | inconclusive
+**Budget used:** <e.g. 22 min, 140 LOC>
+
+## What I did
+<2–3 sentences. What the spike actually exercised.>
+
+## Evidence
+<concrete output, numbers, logs, observed behavior. Paste the relevant snippet.>
+
+## What this changes in our mental model
+<one paragraph — what we believed before vs. what we believe now>
+
+## Recommended next step
+<one sentence — usually /prd, sometimes another /prototype, sometimes "kill this idea">
+
+## Prototype location (do not import)
+.prototypes/<name>/
+```
+
+## What not to do
+
+- **Don't promote the prototype.** Even if it works beautifully. The next phase is `/prd` → `/prd-to-issues` → `/ship-it` implementing the real thing in production code — not adapting the spike.
+- **Don't polish.** Tests, types, lint-clean, comments — none of it. The code is disposable.
+- **Don't expand scope.** "Since I'm here, I'll also test…" — no. File the second question for a separate prototype.
+- **Don't commit `.prototypes/`.** Ever. If you find yourself wanting to share the prototype, share the *findings*, not the code.
+- **Don't ask the user mid-build.** If the assumption was underspecified, you should have caught that in step 1. Once running, run.