chore(skills): add workflow chain — grill-me → prd → prd-to-issues → ship-it

Four workflow skills that take a feature from fuzzy idea to merged code.
Two human-in-the-loop phases (grill-me, prd), one mostly-together (prd-to-issues
files only on explicit 'create'), and one AFK (ship-it).

  grill-me        TOGETHER  pressure-test the idea with hard interview questions
  prd             TOGETHER  synthesize PRD; gaps stay explicit, not papered over
  prd-to-issues   MOSTLY    thin vertical-slice issues with coverage matrix +
                            per-issue Slice check; self-audits before showing
  ship-it         AFK       shell loop ships each slice end-to-end with one
                            commit per issue, status streams to terminal,
                            Ctrl-C-able, survives session close

Vertical-slice principle throughout: every issue cuts end-to-end through every
integration layer (no horizontal "do all the DB work first" issues). The
AFK loop only ships against acceptance criteria already locked in by the PRD
phase — autonomous code never runs against undefined contracts.

ship-it tracker support: gh (GitHub) and tea (Gitea). For this repo, set
SHIP_IT_TRUNK=development to override the main default.

See .claude/skills/README.md for the full how-to and a worked example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
znetsixe
2026-05-21 16:27:15 +02:00
parent 025bdb4c7e
commit 6ff262e96e
7 changed files with 825 additions and 0 deletions

@@ -0,0 +1,115 @@
---
name: ship-it
description: AFK autopilot. Drives a shell loop that works through every ready issue in the tracker (GitHub via gh, Gitea via tea), implementing each vertical slice end-to-end and committing per issue. Status streams to the terminal so the human can tail progress locally and Ctrl-C anytime. The shell is the loop; each iteration dispatches one fresh headless Claude run to ship one issue. Use when the user invokes /ship-it, says "go AFK on this", "work the backlog", "ralph the issues", or "ship everything".
---
# Ship It — AFK backlog autopilot
**Mode: AFK.** No human in the loop. Does not ask questions mid-run. If a slice is undecidable, the iteration labels the issue `needs-decision` and the loop moves on. The human gets one summary at the end, not chatter during.
## How this works (read before invoking)
The actual loop runs in a shell script: `.claude/skills/ship-it/loop.sh`. **The shell is the loop**, not you. Each iteration shells out to a fresh, headless `claude -p` invocation that processes exactly one issue using `.claude/skills/ship-it/iterate.md` as its prompt. Three reasons this design beats "LLM keeps going inside one session":
1. **Fresh context per issue.** No drift, no accumulated history bloating the window.
2. **Visible in the terminal.** Progress streams to stdout and tees to a log file. The human can tail it from another shell, see commits land, and Ctrl-C cleanly.
3. **Survives session close.** Closing the interactive Claude window doesn't kill the loop. Re-attach by tailing the log.
## Files
- `loop.sh` — orchestrator. Tracker detection, preflight, dispatch loop, status output, stop conditions, summary.
- `iterate.md` — the prompt passed to each per-issue headless Claude. Read it; it defines what "shipped" means.
- `SKILL.md` — this file. When the user invokes `/ship-it`, you bootstrap and hand off.
## When the user invokes /ship-it
You (the interactive Claude) do the bootstrap, not the work. Concretely:
1. **Preflight in chat** (catches the obvious failures before the script runs):
   - `git status --porcelain` empty?
   - On `main` (or `$SHIP_IT_TRUNK`)? Up-to-date with origin?
   - `gh auth status` (or tea token) returns 0?
   - `gh issue list --state open --label slice | wc -l` ≥ 1?
2. **Show the plan** in one short block: tracker host, trunk branch, count of ready issues, the first 3 issue titles, the log path. Nothing more.
3. **Ask one question:** "Start? Reply `go`." This is the *only* human-in-the-loop checkpoint — kicking off AFK work is a real commitment, deserves an explicit ok.
4. **On `go`:** run the loop in the foreground so the user sees live output:
```
bash .claude/skills/ship-it/loop.sh
```
Do not background it. Do not pipe through anything that buffers. The user can Ctrl-C.
5. **While it runs:** stay silent. Don't interject. Don't "monitor" by re-reading logs in chat — the user has the terminal.
6. **When it exits:** read the final `==== ship-it summary ====` block from the log file, present it once with concrete next steps ("2 issues are `needs-decision` — open them to answer their questions?").
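
Step 6 depends on pulling the last summary block out of the run log. A minimal sketch, assuming the summary is everything from the final `==== ship-it summary ====` marker to the end of the file; `print_summary` is a hypothetical helper and is not part of `loop.sh`:

```bash
# Hypothetical helper: print the last summary block of a ship-it run log.
print_summary() {
  local log_file="$1"
  local start
  # Line number of the last summary marker; empty when the marker is absent.
  start="$(grep -n '==== ship-it summary ====' "$log_file" | tail -n 1 | cut -d: -f1)"
  [ -n "$start" ] || return 1
  # Print from the marker through the end of the file.
  tail -n "+$start" "$log_file"
}
```

Usage would look like `print_summary .ship-it-logs/run-<RUN_ID>.log`; a non-zero return means the run never reached its summary.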
## Following progress
The script logs to stdout AND tees to `.ship-it-logs/run-<RUN_ID>.log`. Tail from another terminal:
```bash
tail -f .ship-it-logs/run-*.log
```
Per-issue detail (everything the headless Claude did for that one issue) is in `.ship-it-logs/iter-<RUN_ID>-<ISSUE>.log` — useful for debugging a failed iteration.
Commits land in git as the loop runs. Watch with:
```bash
watch -n 5 "git log --oneline -10 origin/${SHIP_IT_TRUNK:-main}"
```
## Config (env vars, override before invoking)
| Var | Default | Purpose |
|---|---|---|
| `SHIP_IT_MAX` | 50 | Hard cap on iterations per run |
| `SHIP_IT_MAX_FAIL` | 3 | Consecutive failures before stop |
| `SHIP_IT_TRUNK` | `main` | Trunk branch name |
| `SHIP_IT_TIMEOUT` | `30m` | Per-issue timeout (kills the headless claude) |
| `SHIP_IT_LOG_DIR` | `<repo>/.ship-it-logs` | Where logs go |
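
Each variable is read with the shell's `${VAR:-default}` pattern, so an override only needs to be set for the one invocation. A small sketch of that resolution; `resolve_trunk` is a hypothetical function that mirrors how `loop.sh` reads `SHIP_IT_TRUNK`, and the values in the comment are examples only:

```bash
# Hypothetical mirror of the default resolution used for SHIP_IT_TRUNK.
resolve_trunk() {
  printf '%s\n' "${SHIP_IT_TRUNK:-main}"
}

# One-off override for a single run (example values):
#   SHIP_IT_TRUNK=development SHIP_IT_MAX=10 bash .claude/skills/ship-it/loop.sh
```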
## What each iteration does (per `iterate.md`)
For one issue: read it → branch from trunk → write failing e2e test at the outermost layer → implement layer by layer until the test passes → run the full suite → outermost-layer smoke check → commit (one commit, message ends `Closes #N`) → push → open PR with acceptance-criteria checkboxes + smoke evidence → wait for CI → merge if green and branch protection allows, else leave open for review → return to trunk → emit `ITERATION_RESULT:` line for the loop.
**Commit per issue:** one commit per slice, referencing the issue, lands on the branch before the PR opens (`iterate.md` also tolerates a tight series, but never WIP or fixup commits). The slice scope was made small in `/prd-to-issues` precisely so that one tight commit is enough.
## Stop conditions (in priority order)
1. **User Ctrl-C** → trap catches SIGINT, current step finishes cleanly, summary prints, exit 130.
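2. **Backlog empty** (no ready issues) → exit 0.
3. **Consecutive hard failures** reach `SHIP_IT_MAX_FAIL` (default 3) → exit 1. This usually signals something systemic: a bad dependency, branch protection blocking merges, or a flaky environment. The loop stops so a human can review.
4. **Precondition violated mid-run** → exit non-zero with reason.
5. **Iteration cap reached** (`SHIP_IT_MAX`, default 50) → exit 0, even if ready issues remain.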
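
The mapping from stop reason to exit code can be sketched as a single `case`. This is an illustration, not the code in `loop.sh`: `exit_code_for` is a hypothetical name, `max-iterations` is an assumed label for hitting the `SHIP_IT_MAX` cap (which the loop treats as a normal exit), and the catch-all value of 1 is an assumption standing in for "non-zero".

```bash
# Hypothetical helper: map a stop reason to the exit code listed above.
exit_code_for() {
  case "$1" in
    user-interrupt)               echo 130 ;;  # 128 + SIGINT (2), the usual shell convention
    backlog-empty|max-iterations) echo 0 ;;    # max-iterations is an assumed label
    consecutive-failures)         echo 1 ;;
    *)                            echo 1 ;;    # assumption: any other stop is a generic failure
  esac
}
```

A wrapper script could then branch on the result, for example only sending a notification when the code is 1.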
## What "ready" means (the loop's filter)
An issue is `ready` iff:
- State is open
- Has label `slice` (filed by `/prd-to-issues`)
- Does NOT have label `blocked`, `needs-decision`, or `ci-failed`
- Is not a spike (spikes deliver decisions, not code — humans handle those)
Issues are processed in number order — walking-skeleton first, as `/prd-to-issues` ordered them.
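
The label rules above can be expressed as a small predicate. A sketch under stated assumptions: `is_ready` is a hypothetical function (the real filter is the `jq` expression in `loop.sh`), labels are compared by exact name, and spike detection is left out because this file does not say how spikes are marked.

```bash
# Hypothetical predicate: is_ready <state> <label>...
# Returns 0 when the issue is open, carries `slice`, and has no excluding label.
is_ready() {
  local state="$1"
  shift
  [ "$state" = "open" ] || return 1
  local has_slice=1
  local label
  for label in "$@"; do
    case "$label" in
      slice) has_slice=0 ;;
      blocked|needs-decision|ci-failed) return 1 ;;
    esac
  done
  return "$has_slice"
}
```

For example, `is_ready open slice enhancement` succeeds, while `is_ready open slice blocked` does not.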
## Safety boundaries
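The headless Claude is launched with a fixed tool allowlist (see `loop.sh`). Because that list includes `Bash`, the allowlist alone does not block destructive commands; the boundaries below are set by the iterate prompt and backed by branch protection and CI. It must not: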
- Force-push or rewrite shared history
- Bypass branch protection or skip CI hooks (`--no-verify`, `--admin`)
- Auto-merge red or pending PRs (the iterate prompt forbids it, and CI gates back it up)
- Modify CI/CD config or IaC unless the slice's `Slice — layers touched` line explicitly names that layer
- Close issues without the outermost-layer smoke check passing
- Assign people or change milestones/projects
If something tries to push past these in practice (e.g. a slice "needs" a CI change to pass), it should fail the iteration with `needs-decision` and let a human approve the scope expansion.
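
As an illustration of how two of these boundaries could be checked mechanically, here is a rough sketch of a flag guard. It is purely hypothetical: `is_forbidden` does not exist in this skill, nothing in `loop.sh` calls anything like it, and the flag list is an assumption covering only the flags named above plus the common force-push spellings.

```bash
# Hypothetical guard: return 0 if a command line contains a forbidden flag.
# Short flags such as -f are deliberately ignored because they are ambiguous
# across commands (for example rm -f).
is_forbidden() {
  case " $1 " in
    *" --no-verify "*|*" --admin "*|*" --force "*|*" --force-with-lease "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A string match like this is easy to evade, so it would complement branch protection rather than replace it.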
## What not to do
- **Don't drive the loop yourself by reading issues and implementing them inline.** The shell is the loop. If you're tempted to "just do this one in chat," stop and run the script.
- **Don't background the script** so the user can keep chatting with you. The output IS the value. The user wants to watch it work.
- **Don't summarize between iterations.** Chatter belongs in the final summary, not after each commit.
- **Don't tag the user in PR/issue comments** during the run. They're not in the loop until the script exits.
- **Don't restart a failed iteration manually.** The loop's `needs-decision` and `ci-failed` labels are how failures stay in the tracker for human triage. Manual restart skips that.
## How this fits the chain
`/grill-me <feature>` (together) → `/prd` (together) → `/prd-to-issues` (mostly together, file step needs `create`) → `/ship-it` (AFK). The four-skill arc takes a vague feature idea to merged code with one human checkpoint per phase boundary.

@@ -0,0 +1,70 @@
# ship-it iterate — one issue, end-to-end
You are running ONE iteration of the ship-it AFK loop. Implement, verify, and ship exactly one issue, then exit. The outer shell loop will pick the next one.
**Mode: AFK.** Do not ask questions. If the issue is genuinely undecidable from its body + linked PRD + grilling notes already in the issue or repo, drop a comment on the issue with the specific question, label it `needs-decision`, and exit with status=needs-decision. Do not guess at user intent.
Variables provided below this prompt: `ISSUE_NUMBER`, `TRACKER_CLI` (`gh` or `tea`), `TRUNK_BRANCH`, `REPO_ROOT`.
## Steps
1. **Read the issue.**
- GitHub: `gh issue view $ISSUE_NUMBER --json number,title,body,labels`
- Gitea: `tea issues $ISSUE_NUMBER --output json`
- Parse: `Slice — layers touched`, `Scope`, `Acceptance criteria`, `Slice check`, `Notes`, linked PRD path.
- If `Acceptance criteria` is missing or non-testable → exit status=needs-decision with reason "acceptance criteria not testable".
2. **Branch from latest trunk.**
`git fetch origin && git switch -c "slice/${ISSUE_NUMBER}-<short-kebab-slug>" "origin/$TRUNK_BRANCH"`
3. **Write the failing e2e test first.** Anchored at the OUTERMOST layer named in `Slice — layers touched` (HTTP endpoint, UI smoke, dashboard query, log assertion — whatever the acceptance criterion observes). Run it. Confirm it fails for the right reason. If you can't write an e2e test for this slice, that's a sign the acceptance criterion isn't really observable end-to-end → exit status=needs-decision.
4. **Implement layer by layer.** Walk the `Slice — layers touched` list. Make the minimal change at each layer to satisfy the slice — do not gold-plate, do not refactor adjacent code, do not "improve" things outside scope. Re-run the e2e test after each layer change.
5. **Run the broader test suite.** Catch regressions caused by the slice. Fix any test that was green before and is now red — do not skip or mark tests. If a test was already red before your changes, leave it (note in PR body).
6. **Outermost-layer smoke check.** The 30-second-demo check: hit the endpoint with curl, query the dashboard, tail the log, load the page. Observe what the acceptance criterion observes. Capture the output (curl response body, log snippet, query result) — you'll paste it into the PR body as evidence.
7. **Commit.** One commit per slice (or a tight series — no WIP commits, no fixup commits, no "address review" before review exists). Read the repo's recent `git log` to match commit style. Message ends with `Closes #${ISSUE_NUMBER}`.
8. **Push and open PR.**
   - GitHub: `git push -u origin HEAD && gh pr create --fill`
   - Gitea: `git push -u origin HEAD && tea pr create --title "..." --description "..."`
   - PR body must include:
     - Each acceptance criterion as a checked `- [x]` line.
     - The smoke-check evidence (curl output / log snippet / screenshot path) in a fenced block.
     - `Closes #${ISSUE_NUMBER}` (so the issue auto-closes on merge).
9. **Wait for CI and decide merge.**
- Poll: `gh pr checks --watch` (or `tea pr status`).
- **All green + branch protection allows direct merge** → `gh pr merge --squash --delete-branch`. Verify the merge commit landed on trunk.
- **All green + branch protection requires human review** → leave PR open. Comment `Ready for review — all acceptance criteria verified, smoke check passed.` on the issue. Exit status=shipped with the PR number.
- **Red CI** → one fix-and-push cycle. Read the failing log, fix the actual cause (do not skip the test). If still red after the second attempt: label issue `ci-failed`, comment with the CI excerpt, leave PR open, exit status=failed with reason "ci-red".
10. **Return to trunk.** `git switch $TRUNK_BRANCH && git pull --ff-only`. If the slice was merged, run the smoke check one more time against integrated trunk. If it fails there → revert the merge, label `regression`, exit status=failed with reason "regression-on-trunk".
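
Step 2 leaves the `<short-kebab-slug>` part of the branch name open. One way to derive it from the issue title is sketched below; `slugify` is a hypothetical helper, and the 40-character cap is an assumption chosen only to keep branch names readable.

```bash
# Hypothetical helper: turn an issue title into a short kebab-case slug.
slugify() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//' \
    | cut -c1-40 \
    | sed -e 's/-$//'   # truncation can leave a trailing dash; drop it
}
```

With a title in `$title`, the branch step could then read `git switch -c "slice/${ISSUE_NUMBER}-$(slugify "$title")" "origin/$TRUNK_BRANCH"`.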
## Boundaries
- Never force-push, never rewrite shared history, never delete branches you didn't create.
- Never bypass branch protection (`--admin`) or skip CI hooks (`--no-verify`).
- Never auto-merge a PR whose CI is red or pending.
- Never close an issue without the outermost-layer smoke check passing.
- Never modify CI/CD config, IaC, or production data unless the slice's `layers touched` explicitly names that layer.
- Never invent acceptance criteria. If they're vague, label `needs-decision`.
- Never assign issues or change milestones.
## Final output line
The shell loop greps for this exact line to determine outcome. Print it as the LAST line before exiting, on its own line, no decoration:
```
ITERATION_RESULT: status=<shipped|failed|needs-decision> issue=#<N> pr=<#N|none> reason=<short single-line reason>
```
Examples:
```
ITERATION_RESULT: status=shipped issue=#142 pr=#287 reason=merged-to-main
ITERATION_RESULT: status=shipped issue=#143 pr=#288 reason=open-for-review
ITERATION_RESULT: status=failed issue=#144 pr=#289 reason=ci-red-after-retry
ITERATION_RESULT: status=needs-decision issue=#145 pr=none reason=acceptance-criteria-not-testable
```
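
For reference, a line in this format can be split into fields with plain `sed`. The sketch below is a hypothetical helper named `result_field`, written in the same spirit as the extraction the shell loop performs; it assumes `reason` is always the last field and may contain spaces.

```bash
# Hypothetical helper: result_field <line> <key>
# Prints the value of one key=value field from an ITERATION_RESULT line.
result_field() {
  local line="$1"
  local key="$2"
  case "$key" in
    # reason is the last field, so take the rest of the line
    reason) printf '%s\n' "$line" | sed -n 's/.* reason=\(.*\)/\1/p' ;;
    # every other field ends at the next space
    *)      printf '%s\n' "$line" | sed -n "s/.* $key=\([^ ]*\).*/\1/p" ;;
  esac
}
```

An empty result means the field was missing, which the caller should treat as an unknown outcome.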

@@ -0,0 +1,189 @@
#!/usr/bin/env bash
# ship-it AFK loop — works through every ready issue end-to-end.
# See SKILL.md for design. Ctrl-C to stop; partial work is preserved on disk.
set -uo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(git rev-parse --show-toplevel 2>/dev/null)" || { echo "not in a git repo"; exit 1; }
# ---- config (env-overridable) ----
MAX_ITERATIONS="${SHIP_IT_MAX:-50}"
MAX_CONSECUTIVE_FAILURES="${SHIP_IT_MAX_FAIL:-3}"
TRUNK_BRANCH="${SHIP_IT_TRUNK:-main}"
ITERATION_TIMEOUT="${SHIP_IT_TIMEOUT:-30m}" # per-issue cap
LOG_DIR="${SHIP_IT_LOG_DIR:-$REPO_ROOT/.ship-it-logs}"
mkdir -p "$LOG_DIR"
RUN_ID="$(date -u +%Y%m%dT%H%M%SZ)"
LOG_FILE="$LOG_DIR/run-$RUN_ID.log"
# ---- logging ----
log() {
local ts; ts="$(date -u +%H:%M:%S)"
printf '[%s] %s\n' "$ts" "$*" | tee -a "$LOG_FILE"
}
die() { log "FATAL: $*"; exit 1; }
# ---- graceful interrupt ----
INTERRUPTED=0
on_interrupt() {
INTERRUPTED=1
log ""
log "interrupt received — finishing current step cleanly, then stopping"
}
trap on_interrupt INT
# ---- tracker detection ----
ORIGIN_URL="$(git -C "$REPO_ROOT" remote get-url origin 2>/dev/null || true)"
if [[ "$ORIGIN_URL" == *"github.com"* ]]; then
TRACKER_CLI="gh"
command -v gh >/dev/null || die "gh CLI not installed"
gh auth status >/dev/null 2>&1 || die "gh not authenticated (run: gh auth login)"
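list_ready_issues() {
# Compare label names exactly; jq's contains() does substring matching on
# strings, so a label like "unblocked" would otherwise count as "blocked".
gh issue list --state open --label slice --limit 100 \
  --json number,title,labels \
  --jq '[.[] | select(.labels | map(.name) | any(. == "blocked" or . == "needs-decision" or . == "ci-failed") | not)] | sort_by(.number)'
}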
elif [[ "$ORIGIN_URL" == *"gitea"* ]]; then
TRACKER_CLI="tea"
command -v tea >/dev/null || die "tea CLI not installed (Gitea repo detected) — install tea or switch to a GitHub remote"
list_ready_issues() {
# Compare label names exactly; jq's contains() does substring matching on
# strings, so a label like "unblocked" would otherwise count as "blocked".
tea issues list --state open --output json 2>/dev/null \
  | jq '[.[] | (.labels // [] | map(.name)) as $names | select(($names | any(. == "slice")) and ($names | any(. == "blocked" or . == "needs-decision" or . == "ci-failed") | not))] | sort_by(.index)'
}
else
die "unknown tracker for origin: '$ORIGIN_URL' (need github.com or gitea.*)"
fi
# ---- preflight ----
cd "$REPO_ROOT"
[[ -z "$(git status --porcelain)" ]] || die "git tree is dirty — commit or stash before starting"
CURRENT_BRANCH="$(git branch --show-current)"
[[ "$CURRENT_BRANCH" == "$TRUNK_BRANCH" ]] || die "not on $TRUNK_BRANCH (on '$CURRENT_BRANCH')"
git fetch origin "$TRUNK_BRANCH" >/dev/null 2>&1 || die "git fetch failed"
LOCAL_SHA="$(git rev-parse HEAD)"
REMOTE_SHA="$(git rev-parse "origin/$TRUNK_BRANCH")"
[[ "$LOCAL_SHA" == "$REMOTE_SHA" ]] || die "$TRUNK_BRANCH not up-to-date with origin (pull first)"
command -v claude >/dev/null || die "claude CLI not on PATH"
# ---- banner ----
log "ship-it run $RUN_ID"
log " tracker: $TRACKER_CLI ($ORIGIN_URL)"
log " trunk: $TRUNK_BRANCH @ ${LOCAL_SHA:0:8}"
log " log: $LOG_FILE"
log " config: max_iter=$MAX_ITERATIONS, max_fail=$MAX_CONSECUTIVE_FAILURES, timeout=$ITERATION_TIMEOUT"
log ""
ITERATE_PROMPT_TEMPLATE="$(cat "$SCRIPT_DIR/iterate.md")"
SHIPPED=()
FAILED=()
NEEDS_DECISION=()
CONSECUTIVE_FAILURES=0
ITERATION=0
# ---- main loop ----
while (( ITERATION < MAX_ITERATIONS )); do
(( INTERRUPTED )) && break
ITERATION=$((ITERATION + 1))
READY_JSON="$(list_ready_issues 2>/dev/null || echo '[]')"
READY_COUNT="$(echo "$READY_JSON" | jq 'length' 2>/dev/null || echo 0)"
if (( READY_COUNT == 0 )); then
log "backlog empty — stopping"
break
fi
ISSUE_NUM="$(echo "$READY_JSON" | jq -r '.[0].number // .[0].index')"
ISSUE_TITLE="$(echo "$READY_JSON" | jq -r '.[0].title')"
log "─────────────────────────────────────────────────────────────"
log "iter $ITERATION | #$ISSUE_NUM \"$ISSUE_TITLE\" ($READY_COUNT ready) → starting"
ITER_LOG="$LOG_DIR/iter-$RUN_ID-$ISSUE_NUM.log"
PROMPT="$ITERATE_PROMPT_TEMPLATE
## Variables for this iteration
- ISSUE_NUMBER=$ISSUE_NUM
- TRACKER_CLI=$TRACKER_CLI
- TRUNK_BRANCH=$TRUNK_BRANCH
- REPO_ROOT=$REPO_ROOT
Begin."
ITER_START="$(date +%s)"
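# errexit is never enabled in this script (see `set -uo pipefail` above), so a
# failing run does not abort the loop; its exit status is captured and
# inspected below.
timeout "$ITERATION_TIMEOUT" claude -p "$PROMPT" \
  --allowed-tools "Bash,Edit,Write,Read,Grep,Glob,WebFetch" \
  --output-format text \
  >"$ITER_LOG" 2>&1
CLAUDE_EXIT=$?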
ITER_END="$(date +%s)"
ITER_DURATION=$((ITER_END - ITER_START))
RESULT_LINE="$(grep -E '^ITERATION_RESULT:' "$ITER_LOG" | tail -1 || true)"
STATUS="$(echo "$RESULT_LINE" | sed -n 's/.*status=\([^ ]*\).*/\1/p')"
PR_FIELD="$(echo "$RESULT_LINE" | sed -n 's/.*pr=\([^ ]*\).*/\1/p')"
REASON="$(echo "$RESULT_LINE" | sed -n 's/.*reason=\(.*\)/\1/p')"
if (( CLAUDE_EXIT == 124 )); then
STATUS="failed"
REASON="timeout after $ITERATION_TIMEOUT"
fi
case "$STATUS" in
shipped)
log "iter $ITERATION | #$ISSUE_NUM ✓ shipped → PR $PR_FIELD (${ITER_DURATION}s)"
SHIPPED+=("#$ISSUE_NUM (PR $PR_FIELD)")
CONSECUTIVE_FAILURES=0
;;
failed)
log "iter $ITERATION | #$ISSUE_NUM ✗ failed: $REASON (${ITER_DURATION}s, see $ITER_LOG)"
FAILED+=("#$ISSUE_NUM ($REASON)")
CONSECUTIVE_FAILURES=$((CONSECUTIVE_FAILURES + 1))
;;
needs-decision)
log "iter $ITERATION | #$ISSUE_NUM ? needs-decision: $REASON (${ITER_DURATION}s)"
NEEDS_DECISION+=("#$ISSUE_NUM ($REASON)")
CONSECUTIVE_FAILURES=0
;;
*)
log "iter $ITERATION | #$ISSUE_NUM ! unknown outcome (claude exit=$CLAUDE_EXIT, ${ITER_DURATION}s) — see $ITER_LOG"
FAILED+=("#$ISSUE_NUM (unknown outcome)")
CONSECUTIVE_FAILURES=$((CONSECUTIVE_FAILURES + 1))
;;
esac
if (( CONSECUTIVE_FAILURES >= MAX_CONSECUTIVE_FAILURES )); then
log "$MAX_CONSECUTIVE_FAILURES consecutive failures — stopping for human review"
break
fi
# back to trunk for next iteration
if [[ "$(git branch --show-current)" != "$TRUNK_BRANCH" ]]; then
git switch "$TRUNK_BRANCH" >/dev/null 2>&1 || log " warn: could not return to $TRUNK_BRANCH"
fi
git pull --ff-only origin "$TRUNK_BRANCH" >/dev/null 2>&1 || log " warn: could not fast-forward $TRUNK_BRANCH"
done
# ---- summary ----
log ""
log "==== ship-it summary ===="
log "iterations: $ITERATION"
log "shipped: ${#SHIPPED[@]} ${SHIPPED[*]:-}"
log "failed: ${#FAILED[@]} ${FAILED[*]:-}"
log "needs-decision: ${#NEEDS_DECISION[@]} ${NEEDS_DECISION[*]:-}"
log "log: $LOG_FILE"
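if (( INTERRUPTED )); then
log "stop reason: user-interrupt"
exit 130
elif (( CONSECUTIVE_FAILURES >= MAX_CONSECUTIVE_FAILURES )); then
log "stop reason: consecutive-failures"
exit 1
elif (( ${READY_COUNT:-0} > 0 )); then
# Ready issues were still queued at the last check, so the loop ended
# because it hit the iteration cap, not because the backlog drained.
log "stop reason: max-iterations"
exit 0
else
log "stop reason: backlog-empty"
exit 0
fi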