chore(skills): add /research and /prototype; rewrite README for 6-skill chain
Front-loads gap discovery before /grill-me by adding two skills:
research MOSTLY fans out Explore + WebSearch agents in parallel,
synthesizes findings into a brief, names open
unknowns explicitly (which become /prototype targets)
prototype MOSTLY builds a throwaway spike to test ONE falsifiable
assumption; code lives in .prototypes/ (gitignored),
never promoted; output is evidence — verdict, numbers,
observed behavior — that feeds /prd
Full chain now:
/research → /prototype → /grill-me → /prd → /prd-to-issues → /ship-it
Chain rationale: /research and /prototype surface knowledge gaps and falsify
risky assumptions while the cost of changing direction is still cheap; the
TOGETHER phases (grill-me, prd) lock down the contract; the AFK phase
(ship-it) only executes against contracts already on paper.
The chain is a default, not a mandate — README covers when to skip
upstream skills for small or stack-familiar work.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -1,132 +1,127 @@
|
|||||||
# Workflow skills — grill-me → prd → prd-to-issues → ship-it
|
# Workflow skills — research → prototype → grill-me → prd → prd-to-issues → ship-it
|
||||||
|
|
||||||
A four-skill chain that takes a vague feature idea from "I want to build X" to merged, end-to-end-verified code. Two phases are collaborative (you + Claude), two run autonomously (AFK).
|
A six-skill chain that takes a vague idea from "I wonder if we could…" to merged, end-to-end-verified code. Four collaborative phases up front to lock down what's being built; two largely-autonomous phases that execute against that contract.
|
||||||
|
|
||||||
```
|
```
|
||||||
/grill-me <topic> TOGETHER pressure-test the idea
|
/research <topic> MOSTLY external + repo knowledge into a brief
|
||||||
↓
|
↓
|
||||||
/prd TOGETHER synthesize a PRD; gaps stay explicit
|
/prototype <claim> MOSTLY throwaway spike to test the riskiest assumption
|
||||||
↓
|
↓
|
||||||
/prd-to-issues MOSTLY thin vertical-slice issues; file when you say "create"
|
/grill-me <topic> TOGETHER pressure-test what survived
|
||||||
|
↓
|
||||||
|
/prd TOGETHER synthesize PRD; gaps stay explicit
|
||||||
|
↓
|
||||||
|
/prd-to-issues MOSTLY thin vertical-slice issues; file on "create"
|
||||||
↓
|
↓
|
||||||
/ship-it AFK shell loop ships every slice end-to-end
|
/ship-it AFK shell loop ships every slice end-to-end
|
||||||
```
|
```
|
||||||
|
|
||||||
## The TOGETHER vs AFK distinction
|
You don't have to use every skill on every feature. Small tweaks may skip `/research` and `/prototype`. Bigger / novel work uses the whole chain.
|
||||||
|
|
||||||
|
## Mode taxonomy
|
||||||
|
|
||||||
| Mode | Meaning | Skills |
|
| Mode | Meaning | Skills |
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| **TOGETHER** | Needs your judgment every turn. No autonomous path. | `grill-me` |
|
| **TOGETHER** | Needs your turn-by-turn judgment. No autonomous path. | grill-me |
|
||||||
| **MOSTLY TOGETHER** | Drafts/audits AFK, but a visible-to-team action needs your "go". | `prd`, `prd-to-issues` |
|
| **MOSTLY TOGETHER** | Drafts / fetches / builds AFK. You review the output. Any visible-to-team action needs your explicit "go". | research, prototype, prd, prd-to-issues |
|
||||||
| **AFK** | No human in the loop. Logs questions to issues instead of asking. | `ship-it` |
|
| **AFK** | No human in the loop. Logs questions to issues instead of asking. | ship-it |
|
||||||
|
|
||||||
Each skill's `SKILL.md` declares its mode in the first body line. The chain is designed so the AFK phase only starts after the human-locked phases have nailed down the contract (PRD requirements, slice acceptance criteria) — AFK code only ships against decisions that already exist on paper.
|
The chain is structured so AFK execution only starts after the human-locked phases have nailed down the contract. Autonomous code never runs against undefined contracts.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## When to use each
|
## When to use each
|
||||||
|
|
||||||
|
### `/research <topic>` — MOSTLY TOGETHER
|
||||||
|
Fans out Explore + WebSearch agents in parallel, synthesizes findings into a research brief, and names open unknowns explicitly (which become candidates for `/prototype`).
|
||||||
|
|
||||||
|
**Use when:** the topic touches anything you haven't done before in this codebase. Novel libraries, unfamiliar patterns, "how do others solve this".
|
||||||
|
|
||||||
|
**Don't use it for:** stuff you already understand. The point is to fetch what you don't know, not summarize what you do.
|
||||||
|
|
||||||
|
### `/prototype <claim>` — MOSTLY TOGETHER
|
||||||
|
Builds a throwaway spike to test ONE falsifiable assumption. Code lives in `.prototypes/` (gitignored) and is never promoted to the main codebase. Output is evidence — verdict, numbers, observed behavior — that feeds the PRD.
|
||||||
|
|
||||||
|
**Use when:** `/research` surfaced an Open Unknown that "we'll find out when we build it". Better to find out for an hour of spike cost than a week of half-built feature.
|
||||||
|
|
||||||
|
**Don't use it for:** building a "lightweight v0" you secretly plan to evolve. Prototypes are evidence; production code is the real implementation. The skill rejects scope creep mid-spike.
|
||||||
|
|
||||||
### `/grill-me <topic>` — TOGETHER
|
### `/grill-me <topic>` — TOGETHER
|
||||||
|
Senior staff engineer running a brutal-but-fair interview. One hard question at a time, honest critique (no praise filler), drills into weak spots. Stay on topic until exhausted; say `stop` for an honest 3-bullet debrief.
|
||||||
|
|
||||||
**Use when:** you have a fuzzy idea and want it pressure-tested before committing to scope. Also useful for interview prep on any technical topic.
|
**Use when:** `/research` and `/prototype` (if used) have built up enough context that you need to pressure-test your own thinking before locking it in a PRD.
|
||||||
|
|
||||||
**Behavior:** acts as a senior staff engineer running a brutal-but-fair interview. One hard question at a time, honest critique (no praise filler), drills into weak spots, demands specifics over buzzwords. Stay on the topic until exhausted; say `stop` for an honest 3-bullet debrief.
|
**Don't use it for:** rubber-stamping a finished idea, or when you want validation. Designed to find gaps, not to agree.
|
||||||
|
|
||||||
**Don't use it for:** rubber-stamping a finished idea, or when you want validation. It's designed to find gaps, not to agree with you.
|
|
||||||
|
|
||||||
### `/prd` — TOGETHER (drafts AFK after grilling)
|
### `/prd` — TOGETHER (drafts AFK after grilling)
|
||||||
|
Engineering PRD: Problem, Goals, Non-goals, Users & scenarios, Functional + Non-functional requirements, Constraints, Success metrics, Open Questions, Out of scope. Things you nailed in grilling become firm requirements. Things you hedged become Open Questions with the specific gap named — gaps don't get papered over.
|
||||||
|
|
||||||
**Use when:** the grilling exposed enough that you're ready to lock down what's being built. Or standalone when you already know the feature shape.
|
**Use when:** the grilling exposed enough that the feature shape is clear. Or standalone when you already have full context.
|
||||||
|
|
||||||
**Behavior:** writes an engineering PRD (Problem, Goals, Non-goals, Users & scenarios, Functional + Non-functional requirements, Constraints, Success metrics, Open Questions, Out of scope). Things you nailed in the grilling become firm requirements. Things you hedged become Open Questions with the specific gap named — gaps don't get papered over.
|
**Don't use it for:** strategy decks, market sizing, "why now". This is for engineering.
|
||||||
|
|
||||||
**Output:** inline by default. Say "save it" → writes to `docs/prd/<name>.md`.
|
|
||||||
|
|
||||||
**Don't use it for:** strategy docs, market sizing, or "why now" sections. This is for engineering.
|
|
||||||
|
|
||||||
### `/prd-to-issues` — MOSTLY TOGETHER
|
### `/prd-to-issues` — MOSTLY TOGETHER
|
||||||
|
Breaks the PRD into **thin vertical slices** — each issue cuts end-to-end through every integration layer (schema → service → API → UI → tests; or sensor → broker → parser → store → dashboard). First slice is a walking skeleton. Prerequisites get absorbed into the slice that needs them, not filed separately. Per-issue `Slice check` block proves every layer is covered, plus a coverage matrix at the top of the draft showing PRD → issue mapping. Self-audit runs **before** the draft is shown to you.
|
||||||
|
|
||||||
**Use when:** you have a PRD (just-drafted or path-pointed) and need a backlog of issues a team can pick up.
|
**Output:** draft inline → you reply `create` → files to the tracker (`gh` for GitHub, `tea` for Gitea).
|
||||||
|
|
||||||
**Behavior:** breaks the PRD into **thin vertical slices** — each issue cuts end-to-end through every integration layer (schema → service → API → UI → tests; or sensor → broker → parser → store → dashboard). The first slice is a walking skeleton. Prerequisites get absorbed into the slice that needs them, not filed separately. Each issue has a visible `Slice check` block proving every layer is covered, plus a coverage matrix at the top of the draft showing PRD → issue mapping. Self-audit runs **before** drafting is shown to you.
|
**Don't use it for:** horizontal task lists ("DB work", "API work", "frontend work"). The skill rejects layer-cake slicing.
|
||||||
|
|
||||||
**Output:** draft inline → you reply `create` → it files to the tracker (`gh` for GitHub, `tea` for Gitea).
|
|
||||||
|
|
||||||
**Don't use it for:** horizontal task lists ("DB work", "API work", "frontend work"). The skill rejects layer-cake slicing — if your work needs that, you want a different tool.
|
|
||||||
|
|
||||||
### `/ship-it` — AFK
|
### `/ship-it` — AFK
|
||||||
|
Shell loop in `.claude/skills/ship-it/loop.sh`. Picks the next ready issue, dispatches a fresh headless Claude to ship it end-to-end (failing e2e test first → implement layer by layer → full suite → outermost-layer smoke check → commit `Closes #N` → PR with acceptance-criteria checkboxes + smoke evidence → CI gate → merge or leave-for-review), then moves on. One commit per issue. Status streams to terminal; tail logs from another shell; Ctrl-C anytime.
|
||||||
**Use when:** issues are filed, you want to walk away, you'll review the PRs later.
|
|
||||||
|
|
||||||
**Behavior:** runs a shell loop that picks the next ready issue, dispatches a fresh headless Claude to ship it end-to-end (failing e2e test first → implement layer by layer → full suite → outermost-layer smoke check → commit `Closes #N` → PR with checked acceptance criteria + smoke evidence → CI gate → merge or leave-for-review), then moves on. One commit per issue. Status streams to the terminal; you can `tail -f` the log from another shell or Ctrl-C anytime.
|
|
||||||
|
|
||||||
Undecidable issues get labeled `needs-decision` and skipped. Three consecutive failures stops the loop for human review.
|
Undecidable issues get labeled `needs-decision` and skipped. Three consecutive failures stops the loop for human review.
|
||||||
|
|
||||||
**Don't use it for:** issues whose acceptance criteria aren't testable (the loop will skip them), or for slices that need real-world side effects with no test harness.
|
**Don't use it for:** issues whose acceptance criteria aren't testable. The loop will skip them.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## A worked example
|
## A worked example
|
||||||
|
|
||||||
You want to add live sensor display for a new flow meter on a dashboard.
|
Adding live sensor display for a new flow meter to the operator dashboard.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 1. Pressure-test the idea
|
# 1. Fetch what we don't already know
|
||||||
|
/research adding live flow-meter readings to the operator dashboard
|
||||||
|
# Brief lands; surfaces an Open Unknown:
|
||||||
|
# "Can Node-RED sustain 1Hz updates across 12 dashboard panels for 10 min
|
||||||
|
# straight without dropping frames?"
|
||||||
|
|
||||||
|
# 2. Test the risky assumption
|
||||||
|
/prototype Node-RED can stream 1Hz updates to 12 Grafana panels for 10 min straight
|
||||||
|
# Spike runs in .prototypes/nodered-throughput/;
|
||||||
|
# Verdict: confirmed, 14% CPU peak. Evidence captured. Prototype stays gitignored.
|
||||||
|
|
||||||
|
# 3. Pressure-test the design
|
||||||
/grill-me adding live flow-meter readings to the operator dashboard
|
/grill-me adding live flow-meter readings to the operator dashboard
|
||||||
|
# 6–8 hard questions; surfaces gaps in alerting and missing-data handling.
|
||||||
|
|
||||||
# (you answer 6-8 hard questions; gaps surface)
|
# 4. Lock down the contract
|
||||||
|
|
||||||
# 2. Lock down the contract
|
|
||||||
/prd
|
/prd
|
||||||
|
# PRD drafts with the alerting decision as a firm requirement and the
|
||||||
|
# missing-data behavior as an explicit Open Question.
|
||||||
|
|
||||||
# (PRD drafts; you edit one section; save it)
|
# 5. Slice it
|
||||||
|
|
||||||
# 3. Slice it
|
|
||||||
/prd-to-issues
|
/prd-to-issues
|
||||||
|
# 5 slices; coverage matrix confirms every PRD requirement maps to a slice.
|
||||||
|
# Reply `create` → issues #142..#146 filed.
|
||||||
|
|
||||||
# (draft shows ~5 slices, each with a Slice check block. Coverage matrix
|
# 6. Walk away
|
||||||
# confirms every PRD requirement maps to a slice. Reply `create`.)
|
|
||||||
# → issues #142..#146 filed
|
|
||||||
|
|
||||||
# 4. Walk away
|
|
||||||
/ship-it
|
/ship-it
|
||||||
|
# Preflight, plan, "Start? Reply `go`." → `go` → shell loop runs.
|
||||||
# (preflight, plan, "Start? Reply `go`." → `go` → shell loop runs)
|
# Another terminal: tail -f .ship-it-logs/run-*.log
|
||||||
# In another terminal: tail -f .ship-it-logs/run-*.log
|
|
||||||
```
|
```
|
||||||
|
|
||||||
After the loop exits, the summary tells you what shipped, what's open for review, what hit `needs-decision`.
|
After ship-it exits, the summary tells you what shipped, what's open for review, what hit `needs-decision`.
|
||||||
|
|
||||||
---
|
## Skipping skills
|
||||||
|
|
||||||
## Configuration
|
The chain is a default, not a mandate:
|
||||||
|
|
||||||
### Repo trunk branch
|
- **Tiny well-understood change:** straight to `/prd-to-issues` (or file the issue by hand and run `/ship-it`).
|
||||||
|
- **Bigger but stack-familiar:** skip `/research` and `/prototype`; start at `/grill-me`.
|
||||||
`ship-it` defaults to `main`. If your repo uses something else:
|
- **Pure research, no implementation yet:** stop after `/research` or `/prototype` — the brief or findings are the deliverable.
|
||||||
|
- **Existing PRD from somewhere else:** `/prd-to-issues <path>` and go.
|
||||||
```bash
|
|
||||||
SHIP_IT_TRUNK=development bash .claude/skills/ship-it/loop.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
### Other env vars
|
|
||||||
|
|
||||||
| Var | Default | Purpose |
|
|
||||||
|---|---|---|
|
|
||||||
| `SHIP_IT_MAX` | 50 | Iteration cap |
|
|
||||||
| `SHIP_IT_MAX_FAIL` | 3 | Consecutive failures before stop |
|
|
||||||
| `SHIP_IT_TIMEOUT` | 30m | Per-issue timeout |
|
|
||||||
| `SHIP_IT_LOG_DIR` | `<repo>/.ship-it-logs` | Log directory |
|
|
||||||
|
|
||||||
### Tracker support
|
|
||||||
|
|
||||||
- **GitHub** — uses `gh` CLI (must be installed and authenticated: `gh auth status`).
|
|
||||||
- **Gitea** — uses `tea` CLI (must be installed: `go install code.gitea.io/tea@latest`, then `tea login add`).
|
|
||||||
- Auto-detected from `git remote get-url origin`.
|
|
||||||
|
|
||||||
### Issue label expected by `ship-it`
|
|
||||||
|
|
||||||
The loop filters to open issues with label `slice` and without labels `blocked`, `needs-decision`, or `ci-failed`. `/prd-to-issues` applies the `slice` label by default. If you file issues by hand, add the label or `ship-it` won't pick them up.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -135,6 +130,10 @@ The loop filters to open issues with label `slice` and without labels `blocked`,
|
|||||||
```
|
```
|
||||||
.claude/skills/
|
.claude/skills/
|
||||||
├── README.md ← this file
|
├── README.md ← this file
|
||||||
|
├── research/
|
||||||
|
│ └── SKILL.md
|
||||||
|
├── prototype/
|
||||||
|
│ └── SKILL.md
|
||||||
├── grill-me/
|
├── grill-me/
|
||||||
│ └── SKILL.md
|
│ └── SKILL.md
|
||||||
├── prd/
|
├── prd/
|
||||||
@@ -145,36 +144,74 @@ The loop filters to open issues with label `slice` and without labels `blocked`,
|
|||||||
├── SKILL.md ← entry point; chat-side bootstrap
|
├── SKILL.md ← entry point; chat-side bootstrap
|
||||||
├── loop.sh ← orchestrator (the actual loop)
|
├── loop.sh ← orchestrator (the actual loop)
|
||||||
└── iterate.md ← per-issue prompt the loop dispatches
|
└── iterate.md ← per-issue prompt the loop dispatches
|
||||||
|
|
||||||
|
.prototypes/ ← throwaway spike code (gitignored, created by /prototype)
|
||||||
|
.ship-it-logs/ ← ship-it loop logs (recommend gitignoring)
|
||||||
|
docs/
|
||||||
|
├── research/ ← saved research briefs ("save it" in /research)
|
||||||
|
└── prd/ ← saved PRDs ("save it" in /prd)
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
### ship-it tracker support
|
||||||
|
- **GitHub** — `gh` CLI required (`gh auth status`)
|
||||||
|
- **Gitea** — `tea` CLI required (`go install code.gitea.io/tea@latest && tea login add`)
|
||||||
|
- Auto-detected from `git remote get-url origin`
|
||||||
|
|
||||||
|
### ship-it env vars
|
||||||
|
|
||||||
|
| Var | Default | Purpose |
|
||||||
|
|---|---|---|
|
||||||
|
| `SHIP_IT_TRUNK` | `main` | Trunk branch (set to `development` for the EVOLV repo) |
|
||||||
|
| `SHIP_IT_MAX` | 50 | Iteration cap |
|
||||||
|
| `SHIP_IT_MAX_FAIL` | 3 | Consecutive failures before stop |
|
||||||
|
| `SHIP_IT_TIMEOUT` | 30m | Per-issue timeout |
|
||||||
|
| `SHIP_IT_LOG_DIR` | `<repo>/.ship-it-logs` | Log directory |
|
||||||
|
|
||||||
|
Example for EVOLV:
|
||||||
|
```bash
|
||||||
|
SHIP_IT_TRUNK=development bash .claude/skills/ship-it/loop.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
### Issue label expected by ship-it
|
||||||
|
The loop filters to open issues with label `slice` and without `blocked`, `needs-decision`, or `ci-failed`. `/prd-to-issues` applies `slice` by default. If you file issues by hand, add the label or ship-it won't pick them up.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Troubleshooting
|
## Troubleshooting
|
||||||
|
|
||||||
**`ship-it` won't start: "tea CLI not installed".**
|
**`ship-it` won't start: "tea CLI not installed".**
|
||||||
The repo's remote is Gitea but you don't have `tea`. Install it (`go install code.gitea.io/tea@latest && tea login add`) or push to a GitHub mirror.
|
Repo remote is Gitea but you don't have `tea`. Install it (`go install code.gitea.io/tea@latest && tea login add`) or push to a GitHub mirror.
|
||||||
|
|
||||||
**`ship-it` exits immediately: "git tree is dirty".**
|
**`ship-it` exits immediately: "git tree is dirty".**
|
||||||
Commit or stash everything before running. The loop won't risk mixing your work-in-progress into a slice.
|
Commit or stash before running. The loop won't risk mixing WIP into a slice.
|
||||||
|
|
||||||
**`ship-it` says "backlog empty" but I have open issues.**
|
**`ship-it` says "backlog empty" but I have open issues.**
|
||||||
The filter requires label `slice` AND no `blocked`/`needs-decision`/`ci-failed`. Add the label, or check what's blocking.
|
Filter requires label `slice` AND none of `blocked` / `needs-decision` / `ci-failed`. Check labels.
|
||||||
|
|
||||||
**An issue keeps getting `needs-decision`.**
|
**An issue keeps getting `needs-decision`.**
|
||||||
Its acceptance criteria probably aren't testable at the outermost layer. Open the issue, rewrite criteria so they're observable (e.g. "POST /x returns 201 and row appears on dashboard" not "feature works"), remove the label, the loop will pick it up next run.
|
Acceptance criteria probably aren't testable at the outermost layer. Rewrite as observable (e.g. "POST /x returns 201 and row appears on dashboard"), drop the label, rerun.
|
||||||
|
|
||||||
**Three failures in a row, loop stopped.**
|
**`/prototype` keeps wanting to "tidy up" the spike before reporting.**
|
||||||
Something systemic — bad dependency, branch protection blocking, flaky test env. Check `.ship-it-logs/iter-*.log` for the per-issue detail.
|
That's a sign the assumption isn't sharp enough — Claude is filling time. Sharpen the assumption and rerun, or just say "stop, report what you have now."
|
||||||
|
|
||||||
**Issue had a PR opened but didn't merge.**
|
**`/research` returns shallow results.**
|
||||||
Branch protection requires human review. The loop reports it as "shipped → open for review" and moves on. Review and merge when you can.
|
The decomposed questions were too broad. Ask it to redo with a tighter scope, or constrain ("only this repo" / "only Node-RED + InfluxDB stack").
|
||||||
|
|
||||||
|
**`/prd-to-issues` drafts look like layer cake.**
|
||||||
|
Stop, say "reslice — these are horizontal." The skill's self-audit should catch this, but if it doesn't, push back explicitly.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Design principles
|
## Design principles
|
||||||
|
|
||||||
- **Vertical slices, always.** No "implement the backend first, then the frontend". Every issue exercises every layer.
|
- **Front-load gap discovery.** Research, prototype, grill-me, PRD — each phase exists to surface gaps before they cost real implementation time.
|
||||||
- **Gaps are explicit, never hidden.** Hedged answers in grilling → Open Questions in PRD → `needs-decision` issues, not silent assumptions.
|
- **Gaps are explicit, never hidden.** Open Unknowns in `/research` → spike claims in `/prototype` → Open Questions in PRD → `needs-decision` labels on issues. Nothing gets papered over.
|
||||||
- **AFK only after the contract is locked.** Autonomous code only ships against decisions that already exist on paper.
|
- **Vertical slices, always.** No "implement the backend first". Every slice exercises every layer.
|
||||||
|
- **AFK only after the contract is locked.** Autonomous code only runs against decisions already on paper.
|
||||||
|
- **Throwaway means throwaway.** Prototypes are evidence; the real implementation in production code happens fresh in `/ship-it`.
|
||||||
|
- **Outermost-layer verification.** "Tests pass" isn't enough — the loop confirms user-observable behavior actually works before reporting shipped.
|
||||||
- **One commit per slice.** Small, reviewable, revertible.
|
- **One commit per slice.** Small, reviewable, revertible.
|
||||||
- **Outermost-layer verification.** "Tests pass" isn't enough — the loop confirms the user-observable behavior actually works before reporting shipped.
|
|
||||||
|
|||||||
65
.claude/skills/prototype/SKILL.md
Normal file
65
.claude/skills/prototype/SKILL.md
Normal file
@@ -0,0 +1,65 @@
|
|||||||
|
---
|
||||||
|
name: prototype
|
||||||
|
description: Build a throwaway spike to falsify or confirm a single risky assumption. Code lives in .prototypes/ (gitignored) and is never promoted to the main codebase. Reports findings as evidence that feeds /prd. Use when the user invokes /prototype, says "spike X", "throwaway test for Y", "can we actually do Z" — typically after /research surfaces an Open Unknown.
|
||||||
|
---
|
||||||
|
|
||||||
|
# Prototype — throwaway spike
|
||||||
|
|
||||||
|
**Mode: MOSTLY TOGETHER.** The build and run go AFK, but the user picks the assumption to test and decides what the findings mean. The output is *evidence*, not production code.
|
||||||
|
|
||||||
|
You are now an engineer running a time-boxed spike to learn one thing. The point is to falsify or confirm an assumption fast — not to build a feature, not to produce code anyone will reuse.
|
||||||
|
|
||||||
|
## Hard rules
|
||||||
|
|
||||||
|
1. **One assumption per prototype.** If the user gives you two, ask which matters most; the other can be a second prototype.
|
||||||
|
2. **The assumption must be falsifiable.** "Will it be fast?" → no. "Can Node-RED sustain 1k msg/s to InfluxDB on the dev VM for 10 min?" → yes. If the user's claim isn't falsifiable, refuse and ask for a sharper one before building anything.
|
||||||
|
3. **Throwaway means throwaway.** Code lives in `<repo-root>/.prototypes/<short-name>/` only. The directory is gitignored (add it as the first step if it isn't). Nothing in `.prototypes/` is ever committed to the main codebase. No exceptions.
|
||||||
|
4. **Time-box.** Default budget: 30 minutes of work and ≤200 LOC. If the user gave a different budget, use that. When you blow through, stop and report whatever you've got.
|
||||||
|
|
||||||
|
## Steps
|
||||||
|
|
||||||
|
1. **Restate the assumption** in falsifiable form. Show it to the user. Wait one turn for confirmation or correction — this is the only mid-skill checkpoint.
|
||||||
|
2. **Pick the minimum viable test.** Options:
|
||||||
|
- **Code spike** — throwaway script that exercises the question. Most common.
|
||||||
|
- **Reading spike** — deep read of a library/spec/codebase, no code. Use when the question is "does X support Y" and the docs would tell you.
|
||||||
|
- **Manual integration spike** — run a command, hit an endpoint, observe. Use when the question is about a real service's behavior.
|
||||||
|
3. **Set up the dir.**
|
||||||
|
```bash
|
||||||
|
ROOT=$(git rev-parse --show-toplevel)
|
||||||
|
mkdir -p "$ROOT/.prototypes/<name>"
|
||||||
|
grep -qxE '\.prototypes/?' "$ROOT/.gitignore" 2>/dev/null || echo '.prototypes/' >> "$ROOT/.gitignore"
|
||||||
|
```
|
||||||
|
4. **Build the smallest thing that tests the assumption.** Resist polish. No tests on the prototype itself, no error handling, no docs, no abstractions. Hardcode values. Inline everything.
|
||||||
|
5. **Run it.** Capture output. If it crashes in a way that's *about* the assumption (e.g. memory blows up at 1k msg/s), that's a finding — not a bug to fix.
|
||||||
|
6. **Iterate up to the budget.** If a quick adjustment sharpens the test, make it. If you're tempted to refactor or expand scope, stop and report instead.
|
||||||
|
7. **Report findings.** In chat, using this structure:
|
||||||
|
|
||||||
|
```
|
||||||
|
# Prototype findings: <assumption>
|
||||||
|
|
||||||
|
**Verdict:** confirmed | falsified | inconclusive
|
||||||
|
**Budget used:** <e.g. 22 min, 140 LOC>
|
||||||
|
|
||||||
|
## What I did
|
||||||
|
<2–3 sentences. What the spike actually exercised.>
|
||||||
|
|
||||||
|
## Evidence
|
||||||
|
<concrete output, numbers, logs, observed behavior. Paste the relevant snippet.>
|
||||||
|
|
||||||
|
## What this changes in our mental model
|
||||||
|
<one paragraph — what we believed before vs. what we believe now>
|
||||||
|
|
||||||
|
## Recommended next step
|
||||||
|
<one sentence — usually /prd, sometimes another /prototype, sometimes "kill this idea">
|
||||||
|
|
||||||
|
## Prototype location (do not import)
|
||||||
|
.prototypes/<name>/
|
||||||
|
```
|
||||||
|
|
||||||
|
## What not to do
|
||||||
|
|
||||||
|
- **Don't promote the prototype.** Even if it works beautifully. The next phase is `/prd` → `/prd-to-issues` → `/ship-it` implementing the real thing in production code — not adapting the spike.
|
||||||
|
- **Don't polish.** Tests, types, lint-clean, comments — none of it. The code is disposable.
|
||||||
|
- **Don't expand scope.** "Since I'm here, I'll also test…" — no. File the second question for a separate prototype.
|
||||||
|
- **Don't commit `.prototypes/`.** Ever. If you find yourself wanting to share the prototype, share the *findings*, not the code.
|
||||||
|
- **Don't ask the user mid-build.** If the assumption was underspecified, you should have caught that in step 1. Once running, run.
|
||||||
70
.claude/skills/research/SKILL.md
Normal file
70
.claude/skills/research/SKILL.md
Normal file
@@ -0,0 +1,70 @@
|
|||||||
|
---
|
||||||
|
name: research
|
||||||
|
description: Gather external knowledge and codebase context for a topic before committing to a direction. Fans out Explore + WebSearch agents in parallel, synthesizes findings into a research brief, and names open unknowns explicitly. Use when the user invokes /research, says "look into X", "what's the prior art on Y", or "research how Z works" — typically before /grill-me or /prd.
|
||||||
|
---
|
||||||
|
|
||||||
|
# Research — fetch knowledge into a brief
|
||||||
|
|
||||||
|
**Mode: MOSTLY TOGETHER.** The fetching and synthesis run AFK (Agent subagents do the legwork). The brief lands in chat; you decide what's worth pursuing. No external state is changed.
|
||||||
|
|
||||||
|
You are now a senior engineer doing a focused research pass. Goal: enough knowledge to make a good `/prd` decision later — no more. Do not write code, do not pick a winner, do not write the PRD. Lay out what's known, what's available, and what's still unknown.
|
||||||
|
|
||||||
|
## Inputs
|
||||||
|
|
||||||
|
The user names a topic. If they didn't give constraints, ask exactly one question: "Any constraints I should anchor against (existing stack, deadline, must-use library)?" Then proceed.
|
||||||
|
|
||||||
|
## How to research
|
||||||
|
|
||||||
|
1. **Decompose the topic into 3–5 specific questions.** Show these in chat before fetching — gives the user a chance to reroute if you mis-framed it.
|
||||||
|
2. **Fan out in parallel** using the Agent tool. Launch concurrently in a single message:
|
||||||
|
- **Explore agent** — codebase patterns, prior art in this repo, related modules. Question: "Does this repo already do something like X? Where? What patterns does it use?"
|
||||||
|
- **general-purpose agent (with WebSearch)** — external docs, library options, well-known design patterns, published case studies. Question: "What are the established approaches to Y? What libraries handle Z?"
|
||||||
|
- Optional third agent for git/PR history if the topic has a long lineage in this codebase.
|
||||||
|
3. **Synthesize, don't dump.** When agents report back, write a brief — not a transcript.
|
||||||
|
|
||||||
|
## Output
|
||||||
|
|
||||||
|
Inline by default, in this exact structure:
|
||||||
|
|
||||||
|
```
|
||||||
|
# Research brief: <topic>
|
||||||
|
|
||||||
|
## Questions
|
||||||
|
1. <decomposed question>
|
||||||
|
2. ...
|
||||||
|
|
||||||
|
## What's already in this codebase
|
||||||
|
- <finding> (path/to/file.ts:42)
|
||||||
|
- ...
|
||||||
|
|
||||||
|
## External options
|
||||||
|
- **<option>** — <one-line eval. when it fits, when it doesn't>
|
||||||
|
- ...
|
||||||
|
|
||||||
|
## Prior art
|
||||||
|
- <link> — <one-line takeaway>
|
||||||
|
- ...
|
||||||
|
|
||||||
|
## Open unknowns
|
||||||
|
- <thing no source can answer; candidate for /prototype>
|
||||||
|
- ...
|
||||||
|
|
||||||
|
## Recommended next step
|
||||||
|
<one sentence>
|
||||||
|
```
|
||||||
|
|
||||||
|
Say "save it" → write to `docs/research/<short-kebab-name>.md`.
|
||||||
|
|
||||||
|
## Quality bar
|
||||||
|
|
||||||
|
- Specific over comprehensive. A 1-page brief that surfaces the real decision beats a 5-page survey.
|
||||||
|
- Cite sources for every claim. `file:line` for codebase, URL for external. No floating assertions.
|
||||||
|
- Name what you don't know. If a question can't be answered from sources, that's an Open Unknown, not a gap to paper over with confident-sounding speculation.
|
||||||
|
- Don't recommend a winner among external options. Surface tradeoffs; `/prd` picks.
|
||||||
|
|
||||||
|
## What not to do
|
||||||
|
|
||||||
|
- Don't write code. Not even illustrative snippets. The output is a brief, not a sketch.
|
||||||
|
- Don't open files yourself to skim — let the Explore agent do that. Synthesizing is your job.
|
||||||
|
- Don't fabricate. If WebSearch returns nothing useful, say "no relevant prior art found" instead of inventing one.
|
||||||
|
- Don't make product decisions. "Should we use X or Y?" → both, with tradeoffs, then "your call."
|
||||||
Reference in New Issue
Block a user