- package.json: remove @tensorflow/tfjs and @tensorflow/tfjs-node. Monster's TF code was already stripped; the deps were stale and kept pulling a heavy native binary back into every install.
- .gitignore: ignore .repo-mem/ regenerable indexes and per-session .claude/*.lock runtime files.
- CLAUDE.md: prepend READ-FIRST pointer to .claude/rules/repo-mem.md; collapse the 'three outputs' bullet to a pointer at node-architecture.
- .claude/rules/telemetry.md: drop Port 0/1/2 duplication; reference node-architecture.md.
- .claude/rules/testing.md: stop requiring a separate test/edge tier and the basic/integration/edge example flow trio. Reflects what nodes actually do.
- .claude/rules/repo-mem.md (new): when-to-call-which guide for the per-repo memory MCP, anti-patterns, refresh model.
- .mcp.json (new): wire repo-mem stdio server.
- docs/DEVELOPER_GUIDE.md (new): step-by-step guide for adding a new EVOLV node under the three-layer pattern.
- Bump nodes/pumpingStation to 6ab585b (docs + simulations refresh, spill-flow path renames consistent with d8490aa).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
repo-mem MCP Tools
This repo has a per-repo memory MCP server (repo-mem) wired via .mcp.json. It exposes 5 tools backed by a Hopfield substrate trained on EVOLV's source plus a BM25 index over file chunks. Use them. They are faster and better-targeted than grep for concept queries, and they accumulate institutional memory of repairs.
If /mcp does not list repo-mem as Connected, the rest of this file does not apply for this session — fall back to grep / Read.
When to call which tool
repo_search(query, k=8) — primary lookup tool
Use before grep / find / Explore agent for any natural-language "where is X handled / find all places that do Y / what code implements Z" question.
- ✅ "where is the predicted volume integrator?" → repo_search
- ✅ "find places that emit InfluxDB line protocol" → repo_search
- ❌ "find every occurrence of _updatePredictedVolume" → grep (exact symbol; BM25 doesn't beat grep at exact-string lookup)
- ❌ "list all .test.js files" → find / ls (no concept query)
Returns top-K files with file:line ranges and snippets. Read the snippet first; only open the file if the snippet doesn't answer the question.
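The split between concept queries and exact lookups can be sketched as a small routing heuristic. This is only an illustration: chooseLookupTool and its regular expressions are hypothetical and not part of repo-mem.

```javascript
// Hypothetical helper: pick a lookup tool from the shape of the query.
// The patterns below are illustrative assumptions, not repo-mem behaviour.
function chooseLookupTool(query) {
  const q = query.trim();
  // A single token that looks like an identifier: exact-string lookup.
  if (/^[A-Za-z_$][\w$.]*$/.test(q)) return "grep";
  // A glob or a bare file extension: plain file listing.
  if (/^\*?\.[\w.]+$/.test(q) || q.includes("*")) return "find";
  // Anything phrased in natural language: concept query.
  return "repo_search";
}

console.log(chooseLookupTool("where is the predicted volume integrator?"));
console.log(chooseLookupTool("_updatePredictedVolume"));
console.log(chooseLookupTool("*.test.js"));
```

A one-word concept query such as "telemetry" would be routed to grep by this sketch, so treat it as a first guess rather than a rule.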
repo_similar_fixes(query, failure?, files?, tags?, k=5) — start-of-task context
Call at the start of any non-trivial bug fix or behavioral change. Cheap (BM25 + file overlap + atom cosine), zero downside if it returns nothing useful.
- Pass the user's task description as query.
- If there's a failing test or stack trace, pass it as failure.
- If you already know which files are involved, pass them as files.
- Skim the returned traces; surface any near-match to the user before starting.
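A minimal sketch of assembling those arguments follows. The parameter names (query, failure, files, tags, k) come from the tool signature above; buildSimilarFixesArgs is a hypothetical helper, and treating files and tags as arrays is an assumption.

```javascript
// Hypothetical helper: build the argument object for repo_similar_fixes.
// Optional fields are left out entirely when there is nothing to pass.
function buildSimilarFixesArgs({ task, failure, files, tags, k = 5 }) {
  if (!task || !task.trim()) {
    throw new Error("query (the task description) is required");
  }
  const args = { query: task.trim(), k };
  if (failure) args.failure = failure;
  if (files && files.length > 0) args.files = files; // assumed: array of paths
  if (tags && tags.length > 0) args.tags = tags;     // assumed: array of labels
  return args;
}

console.log(buildSimilarFixesArgs({
  task: "predicted volume drifts after overflow",
  failure: "AssertionError: expected volume to be clamped",
}));
```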
repo_record_fix({task, failure, files, diff_summary, patch, tests, outcome, tags}) — end-of-task persist
Call at the end of a landed fix or behavioral change, before reporting completion to the user. Skip for trivial typo/comment commits. Required fields: task and outcome. Recommended:
- failure: the symptom that prompted the work (test output, user description, stack trace).
- files: the files actually changed.
- diff_summary: 1–3 sentences on what changed and why.
- patch: the unified diff (truncate to the load-bearing hunks if huge).
- tests: the verification command(s) you ran.
- outcome: passed / failed / partial / reverted.
- tags: short labels (overflow-clamp, tokenizer, migration, etc.) for retrieval bias.
Rule of thumb: if the change took more than one read+edit pair, record it.
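As a sketch of what a record might look like before it is handed to repo_record_fix, the helper below checks the two required fields and copies over whichever recommended fields are present. buildFixRecord is hypothetical, and the server's own validation may differ.

```javascript
// Outcome values listed in this guide.
const OUTCOMES = ["passed", "failed", "partial", "reverted"];
// Recommended (optional) fields listed in this guide.
const OPTIONAL_FIELDS = ["failure", "files", "diff_summary", "patch", "tests", "tags"];

// Hypothetical helper: validate and assemble a repo_record_fix payload.
function buildFixRecord(fields) {
  const { task, outcome } = fields;
  if (!task) throw new Error("task is required");
  if (!OUTCOMES.includes(outcome)) {
    throw new Error(`outcome must be one of: ${OUTCOMES.join(", ")}`);
  }
  const record = { task, outcome };
  for (const key of OPTIONAL_FIELDS) {
    if (fields[key] !== undefined) record[key] = fields[key];
  }
  return record;
}

console.log(buildFixRecord({
  task: "clamp predicted volume at overflow level",
  outcome: "passed",
  tags: ["overflow-clamp"],
}));
```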
substrate_score(text, worst_k=5) — OOD-token check
Use sparingly. After generating a non-trivial code block (≥ ~30 lines of new logic, not test scaffolding), pass it through substrate_score and inspect the worst-confidence positions for typos, wrong identifiers, or out-of-house style. Noisy on small additions — don't use it for one-line tweaks.
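The size gate described above could be approximated as follows. shouldScoreBlock is a hypothetical helper; counting non-blank lines is only a rough stand-in for "lines of new logic", and the caller has to say whether the block is test scaffolding.

```javascript
// Hypothetical gate: decide whether a new block is worth sending to substrate_score.
// The 30-line default mirrors the rough threshold given in this guide.
function shouldScoreBlock(code, { minLines = 30, isTestScaffolding = false } = {}) {
  if (isTestScaffolding) return false;
  // Count non-blank lines as a crude proxy for the amount of new logic.
  const nonBlank = code.split("\n").filter((line) => line.trim() !== "").length;
  return nonBlank >= minLines;
}

const oneLiner = "const limit = 42;";
console.log(shouldScoreBlock(oneLiner)); // false: far below the threshold
```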
substrate_top_next(context, k=10) — rarely
Predicts next BPE-subword tokens in the local style. Mostly useful for autonomous solver loops; in interactive review it's diagnostic only. If you find yourself wanting it, you probably want repo_search instead.
Workflow shape
new task arrives
↓
repo_similar_fixes(query=user_task) ← cheap, always do this for non-trivial tasks
↓
repo_search(query=concept) ← when scoping
↓
[normal Read / Edit / Bash work]
↓
[after generating non-trivial new code]
substrate_score(text=new_block) ← optional, only if block is big
↓
[verify: tests / build / smoke]
↓
repo_record_fix({...}) ← before final user-facing summary
Anti-patterns
- ❌ Calling repo_search when you already know the file path. Just Read it.
- ❌ Calling repo_record_fix after every micro-edit. Only at meaningful task boundaries.
- ❌ Treating substrate_top_next results as authoritative; they reflect repo style, not correctness.
- ❌ Passing the full conversation to substrate_score; it's per-snippet, not per-session.
Refresh model
The post-commit hook auto-runs --quick --lock (re-ingest + BM25 + chunk re-embed; substrate retrain skipped) so retrieval stays current within ~2 s of any commit. The substrate itself is only retrained when you (or a maintainer) run --full manually:
node ~/anchor-net-master/tools/repo-mem/refresh.mjs \
--repo . --in .repo-mem --full
Re-train when the repo gains substantially new vocabulary (new node, new domain, new dependency surface). Otherwise BM25 + existing atoms keep up.