fix(memory): resolve claude on PATH for backfill + surface per-session failures (#276) by khustup2 · Pull Request #277 · activeloopai/hivemind

khustup2 · 2026-06-18T16:12:14Z

Fixes #276 — hivemind memory backfill stages 0/N (0 embedded, N failed) with no surfaced reason.

What was wrong

1. The failure reason was swallowed (the explicit ask)

stageSession already computes a precise reason for every outcome — claude-failed, no-summary, empty-summary, jsonl-missing, mkdir-failed — but it was dropped at the StageFn boundary (defaultStageFn returned only {ok, embedded}) and executeBackfill just did summary.failed++. So the CLI could only print a bare N failed.

Now the reason is carried through, tallied per-reason, recorded per-session. The report always prints the breakdown, and --verbose/-v lists each failing session. A thrown stager is captured as threw: <message>.

2. Root cause: claude binary not found

The stager resolved claude via findAgentBin("claude_code") — a hard-coded candidate-list resolver with no PATH walk, falling back to ~/.claude/local/claude. When claude lives off that list (e.g. an nvm/npm-global bin — the reporter's macOS+nvm setup), the fallback doesn't exist → every spawn ENOENTs → every session is claude-failed. The live wiki-worker uses the PATH-aware resolveCliBin("claude"); this aligns backfill with it and reuses buildClaudeInvocation for the Windows .cmd path. (stage-memory.ts is in the CLI bundle, not the ClawHub-scanned openclaw worker, so the PATH-free constraint behind findAgentBin doesn't apply.)

Reproduced in isolation

Simulated the reporter's environment (claude on PATH but off every findAgentBin candidate, isolated HOME so the fallback is absent) and drove the real stageSession through both resolvers:

OLD findAgentBin   -> <home>/.claude/local/claude   exists: false
                      stageSession -> { ok:false, reason:"claude-failed" }   <- the reporter's "N failed"
NEW resolveCliBin  -> <home>/nvm-bin/claude          exists: true
                      stageSession -> { ok:true }                            <- staged

The exact symptom (0/N, 0 embedded, N failed) reproduced, the swallowed reason shown to be claude-failed, and the fix flipping it to staged. (Repro manipulates HOME/PATH/which — kept as throwaway verification, not a Windows-fragile committed test.)

Caveat: this reproduces the claude-off-the-candidate-list case. If a user's claude IS at a candidate path, their failure is a different reason (e.g. no-summary) — which the surfacing change now makes visible instead of swallowing.

Tests & coverage

Pure units extracted (planClaudeSpawn, summarizeExtract, renderFailures, releaseBackfillLock); isolated mock files cover the Windows stdin spawn path and the real defaultEmbed enabled/disabled branches without spawning claude or an embed daemon.

File	Stmts	Branch	Funcs	Lines
`commands/backfill-memory.ts`	99.3	96.6	94.7	99.2
`skillify/stage-memory.ts`	100	94.7	100	100

Both >=90% on every metric; branch thresholds raised 85->90. Typecheck clean; targeted suites green (58 tests); full suite shows only pre-existing build-artifact failures (unchanged from baseline).

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added --verbose for memory backfill with richer, per-session failure diagnostics.
- Memory backfill now supports additional optional backfill flags (--window-days, --project-only) in the usage message.
Bug Fixes
- Improved Claude executable resolution to better prevent missing-binary failures.
- Enhanced Windows-specific Claude invocation handling and safer lock-file cleanup.
Tests
- Expanded backfill failure-reporting, exit-code, and lock-ownership coverage.
- Added embedding and Windows stdin/shell staging test coverage.
- Tightened coverage thresholds for memory backfill and staging.

…n failures `hivemind memory backfill` staged 0/N with a bare "N failed" and no way to see why (#276). Root cause (most likely): the stager resolved the claude binary via findAgentBin's hard-coded candidate list, which has no PATH walk and falls back to ~/.claude/local/claude. When claude lives elsewhere (e.g. an nvm/npm-global bin), the fallback doesn't exist, every spawn ENOENTs, and each session is recorded as `claude-failed`. The live wiki-worker already uses the PATH-aware resolveCliBin; align backfill with it and reuse buildClaudeInvocation so the Windows `.cmd` shim path is handled too. Surfacing: stageSession already computed a precise `reason` (claude-failed / no-summary / empty-summary / jsonl-missing / mkdir-failed) but it was dropped at the StageFn boundary. Carry it through, tally per reason, record per-session detail, always print the breakdown, and add --verbose to list each failing session. A bare "N failed" is now self-diagnosable. Tests + coverage: extract pure planClaudeSpawn / summarizeExtract / renderFailures / releaseBackfillLock units; cover the Windows stdin spawn and the real defaultEmbed enabled/disabled branches via isolated mocks. backfill-memory.ts and stage-memory.ts both ≥90% on every metric; raise their branch thresholds 85→90.

coderabbitai · 2026-06-18T16:12:27Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: dc6387b9-f039-403e-97c6-5009c8ae0342

📥 Commits

Reviewing files that changed from the base of the PR and between 28224f8 and c0f8adb.

📒 Files selected for processing (1)

tests/claude-code/stage-memory-shell-spawn.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

tests/claude-code/stage-memory-shell-spawn.test.ts

📝 Walkthrough

Walkthrough

Adds --verbose flag and structured failure reporting (BackfillFailure, failureReasons, renderFailures, summarizeExtract) to the memory backfill pipeline. Fixes Claude binary resolution to use the PATH-aware resolveCliBin("claude") and introduces planClaudeSpawn for Windows .cmd stdin spawning. Test coverage and CLI help text are updated accordingly.

Changes

Backfill verbose failure reporting and Windows spawn fix

Layer / File(s)	Summary
Verbose option and CLI help `src/cli/index.ts`, `src/commands/backfill-memory.ts`	`BackfillOptions` adds `verbose` boolean flag parsed from `--verbose/-v`; CLI help text documents new backfill flags in the usage message.
Failure model and data contracts `src/commands/backfill-memory.ts`	`BackfillFailure` interface introduced; `ExtractSummary` augmented with `failureReasons` (reason → count map) and `failures` (per-session entries); `StageFn` result extended with optional `reason` field.
Failure recording and reason propagation `src/commands/backfill-memory.ts`	Default stage function forwards failure reason from `stageSession` into `StageFn` result; `recordFailure` helper increments counts and appends entries; worker loop captures reason on `ok:false` and normalizes thrown errors to `"threw: ..."` messages.
Lock generalization, runExtract export, and reporting helpers `src/commands/backfill-memory.ts`	`releaseBackfillLock` accepts optional `lockPath` parameter; `runBackfillMemory` delegates to exported `runExtract(plan, cwd, opts)` under try/finally lock-release guard; `renderFailures` generates reason tallies and verbose lines; `summarizeExtract` assembles headline and diagnostics with exit code; `runExtract` prints summarized lines and returns exit code.
ClaudeSpawnPlan, planClaudeSpawn, and resolveClaudeBin `src/skillify/stage-memory.ts`	`ClaudeSpawnPlan` and `planClaudeSpawn` translate `buildClaudeInvocation` output into concrete `spawn()` parameters with conditional stdin piping for Windows `.cmd` shims; `runClaude` rewired to use the plan; `resolveClaudeBin` switched to PATH-aware `resolveCliBin("claude")`.
backfill-memory test coverage `tests/claude-code/backfill-memory.test.ts`	Tests cover `--verbose`/`-v` parsing, `executeBackfill` failure reason propagation (explicit reason, unknown fallback, thrown errors), `renderFailures` output for tallies and verbose lines, `summarizeExtract` exit codes and diagnostics, `runExtract` empty-plan short-circuit, and `releaseBackfillLock` ownership-gated deletion.
planClaudeSpawn, shell-spawn, and embed tests `tests/claude-code/stage-memory.test.ts`, `tests/claude-code/stage-memory-shell-spawn.test.ts`, `tests/claude-code/stage-memory-embed.test.ts`	Unit tests for `planClaudeSpawn` across Unix and Windows `.cmd` scenarios; end-to-end shell-spawn test using hoisted stdin reader; `stageSession` embedding persistence and opt-out via mocked `EmbedClient` and gating hooks.
Coverage thresholds and CLI test update `vitest.config.ts`, `tests/cli/cli-index.test.ts`	Branch coverage thresholds raised to 90 for `stage-memory.ts` and `backfill-memory.ts`; expected stderr usage string updated for new backfill flags.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as runBackfillMemory
  participant runExtract as runExtract(plan, cwd, opts)
  participant executeBackfill as executeBackfill worker
  participant stageSession as stageSession / stage()
  participant summarizeExtract as summarizeExtract()
  participant renderFailures as renderFailures()

  CLI->>runExtract: plan, cwd, opts (verbose)
  runExtract->>executeBackfill: iterate sessions
  loop per session
    executeBackfill->>stageSession: stage(session)
    alt ok: false
      stageSession-->>executeBackfill: { ok:false, reason }
      executeBackfill->>executeBackfill: recordFailure(reason)
    else throws
      stageSession-->>executeBackfill: throws Error
      executeBackfill->>executeBackfill: recordFailure("threw: <msg>")
    end
  end
  executeBackfill-->>runExtract: ExtractSummary (failureReasons, failures)
  runExtract->>summarizeExtract: result, opts.verbose
  summarizeExtract->>renderFailures: result, verbose
  renderFailures-->>summarizeExtract: failure lines
  summarizeExtract-->>runExtract: { lines, exitCode }
  runExtract-->>CLI: exitCode

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

memory backfill stages 0/N (0 embedded, N failed) with no surfaced error on clean supported install #276: This PR directly implements the --verbose flag and surfaces per-session failure reasons requested in the issue, via BackfillFailure, failureReasons/failures on ExtractSummary, and renderFailures.

Possibly related PRs

activeloopai/hivemind#250: Both PRs modify Claude CLI invocation for Windows by reusing shared resolveCliBin/buildClaudeInvocation logic—this PR adapts stage-memory.ts to the same PATH-aware binary resolution and spawn planning introduced for the wiki-worker.
activeloopai/hivemind#269: Main PR enhances the existing retro-backfill feature from PR #269 by extending src/commands/backfill-memory.ts failure aggregation/reporting (verbose, failureReasons, StageFn.reason, renderFailures/summarizeExtract) and by updating src/skillify/stage-memory.ts spawn planning + resolveClaudeBin resolution behavior—both directly touching the same core modules introduced in #269.

Suggested reviewers

kaghni
efenocchi

Poem

🐇 Hoppity-hop through the backfill queue,
Failures now tagged with a reason true!
Windows .cmd gets its stdin fed,
--verbose shines light on what went dead.
With coverage raised and all tests green,
The fluffiest pipeline ever seen! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the two main changes: fixing claude binary resolution on PATH and surfacing per-session failure reasons in the memory backfill command.
Description check	✅ Passed	The description comprehensively covers the problem statement, root causes, reproduction steps, test coverage, and validation metrics, though it lacks an explicit version bump note per the template.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/backfill-surface-errors

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-06-18T16:14:12Z

Coverage Report

Scope: files changed in this PR. Enforced threshold: 90% per metric (per file via vitest.config.ts).

Status	Category	Percentage	Covered / Total
🟢	Lines	90.59% (🎯 90%)	356 / 393
🟢	Statements	91.90% (🎯 90%)	431 / 469
🟢	Functions	97.83% (🎯 90%)	45 / 46
🟢	Branches	90.30% (🎯 90%)	270 / 299

File Coverage — 3 files changed

File	Stmts	Branches	Functions	Lines
`src/cli/index.ts`	🔴 86.2%	🔴 85.5%	🟢 100.0%	🔴 83.9%
`src/commands/backfill-memory.ts`	🟢 99.3%	🟢 97.8%	🟢 94.7%	🟢 99.2%
`src/skillify/stage-memory.ts`	🟢 100.0%	🟢 94.7%	🟢 100.0%	🟢 100.0%

_{Generated for commit 1f0bf9d.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/claude-code/backfill-memory.test.ts (1)
297-299: ⚡ Quick win

Prefer exact output assertions over substring checks in these tests.

Several checks here rely on generic toContain(...) matches. Please assert exact expected lines/messages (or exact rendered arrays) so regressions in wording/structure are caught deterministically.

As per coding guidelines, "tests/**: Prefer asserting on specific values (paths, messages) over generic substrings."

Also applies to: 321-323, 328-329, 360-360
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/claude-code/backfill-memory.test.ts` around lines 297 - 299, The test
assertions in the backfill-memory.test.ts file use generic toContain() substring
matching (at lines 297-299, 321-323, 328-329, and 360), which won't catch
regressions in output wording or structure. Replace these generic substring
checks with exact assertions that verify specific complete lines or the exact
rendered output format. For example, instead of checking if the joined output
contains a substring like "--verbose", assert that a specific expected line
appears in the output array, or verify the complete output matches an expected
format exactly.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/commands/backfill-memory.ts`:
- Around line 339-343: The try-finally block currently only wraps the runExtract
call, but the backfill lock is acquired before this block in the planBackfill
and renderPlan operations. If either of those operations throws an error, the
lock will not be released. Restructure the code by moving the beginning of the
try block to encompass the planBackfill and renderPlan calls as well, so that
the releaseBackfillLock call in the finally block will properly execute and
release the lock regardless of whether the failure occurs during planning,
rendering, or execution.

In `@tests/claude-code/stage-memory-shell-spawn.test.ts`:
- Line 90: Replace the generic substring assertion toContain("from stdin") with
an exact content assertion using toBe() in the readFileSync expectation on line
90. Since the file content is deterministic, assert the complete expected file
content rather than just checking if it contains a substring. Capture the full
expected markdown content that should be written to the claude_code-s1.md file
and compare it exactly.

---

Nitpick comments:
In `@tests/claude-code/backfill-memory.test.ts`:
- Around line 297-299: The test assertions in the backfill-memory.test.ts file
use generic toContain() substring matching (at lines 297-299, 321-323, 328-329,
and 360), which won't catch regressions in output wording or structure. Replace
these generic substring checks with exact assertions that verify specific
complete lines or the exact rendered output format. For example, instead of
checking if the joined output contains a substring like "--verbose", assert that
a specific expected line appears in the output array, or verify the complete
output matches an expected format exactly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: cee94838-18f8-4bf8-9b52-e1775d34406a

📥 Commits

Reviewing files that changed from the base of the PR and between 94b66dd and 641d15f.

📒 Files selected for processing (9)

src/cli/index.ts
src/commands/backfill-memory.ts
src/skillify/stage-memory.ts
tests/claude-code/backfill-memory.test.ts
tests/claude-code/stage-memory-embed.test.ts
tests/claude-code/stage-memory-shell-spawn.test.ts
tests/claude-code/stage-memory.test.ts
tests/cli/cli-index.test.ts
vitest.config.ts

…rings CodeRabbit review on #277: - Widen runBackfillMemory's try so the finally always releases the lock — a planBackfill/renderPlan throw must not strand a worker-owned lock. Safe: releaseBackfillLock no-ops unless this process owns the lock, so the dry-run early return under it is unaffected. - Add docstrings to the exported backfill/stage-memory functions and interfaces to satisfy the docstring-coverage pre-merge check.

CodeRabbit review on #277: replace toContain with an exact toBe match on the deterministic summary content.

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

Comment thread src/commands/backfill-memory.ts

Comment thread tests/claude-code/stage-memory-shell-spawn.test.ts Outdated

khustup2 added 2 commits June 18, 2026 16:37

test(memory): assert exact staged summary content in shell-spawn repro

c0f8adb

CodeRabbit review on #277: replace toContain with an exact toBe match on the deterministic summary content.

khustup2 requested a review from efenocchi June 18, 2026 18:12

efenocchi approved these changes Jun 18, 2026

View reviewed changes

khustup2 merged commit 3dc256a into main Jun 18, 2026
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(memory): resolve claude on PATH for backfill + surface per-session failures (#276)#277

fix(memory): resolve claude on PATH for backfill + surface per-session failures (#276)#277
khustup2 merged 3 commits into
mainfrom
fix/backfill-surface-errors

khustup2 commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Jun 18, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

khustup2 commented Jun 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What was wrong

1. The failure reason was swallowed (the explicit ask)

2. Root cause: claude binary not found

Reproduced in isolation

Tests & coverage

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Coverage Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

khustup2 commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading

github-actions Bot commented Jun 18, 2026 •

edited

Loading