Scoped review lanes now have clear scope but still emit only
the review request in stop events, not the actual verdict.
Adding requirement for structured approve/reject/blocked events.
Source: gaebal-gajae dogfood analysis 2026-04-12
ROADMAP #38 no longer reflects current main. The runtime already runs a
post-compaction session-health probe, but the backlog lacked explicit
regression proof. This change adds focused tests for the two important
behaviors: a broken tool surface aborts a compacted session with a targeted
error, while a freshly compacted empty session does not false-positive as
dead. With that proof in place, the roadmap item can be marked done.
Constraint: User required fresh cargo fmt/clippy/test evidence before closing any backlog item
Rejected: Leave #38 open because the implementation already existed | backlog stays stale and invites duplicate work
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Reopen#38 only with a fresh same-turn repro that bypasses the current health-probe gate
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: No live long-running dogfood session replay beyond existing automated coverage
ROADMAP #65: Team lanes need structured selection events (chosenItems,
skippedItems, rationale) instead of opaque prose summaries.
ROADMAP #66: Reminder/cron should auto-expire when terminal task
completes — currently keeps firing after work is done.
Source: gaebal-gajae dogfood analysis 2026-04-12
Documents the late-arriving droid output issue discovered during
ultraclaw batch processing. Sessions report completion before
file writes are fully flushed to working tree.
Source: ultraclaw dogfood 2026-04-12
ROADMAP #24 no longer reproduces on current main. Both focused plugin
lifecycle tests pass in isolation and the current full workspace test run
includes them as green, so the backlog entry was stale rather than still
actionable.
Constraint: User explicitly required re-verifying with cargo fmt, cargo clippy --workspace --all-targets -- -D warnings, and cargo test --workspace before closeout
Rejected: Leave #24 open without a fresh repro | keeps the immediate backlog inaccurate and invites duplicate work
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Reopen#24 only with a fresh parallel-execution repro on current main
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; cargo test -p rusty-claude-cli build_runtime_runs_plugin_lifecycle_init_and_shutdown -- --nocapture; cargo test -p plugins plugin_registry_runs_initialize_and_shutdown_for_enabled_plugins -- --nocapture
Not-tested: No synthetic stress harness beyond the existing workspace-parallel run
ROADMAP #37 was still open even though several earlier backlog items were
already closed. This change removes the local login/logout surface, stops
startup auth resolution from treating saved OAuth credentials as a supported
path, and updates diagnostics/help to point users at ANTHROPIC_API_KEY or
ANTHROPIC_AUTH_TOKEN only.
While proving the change with the user-requested workspace gates, clippy
surfaced additional pre-existing warning failures across the Rust workspace.
Those were cleaned up in-place so the required `cargo fmt`, `cargo clippy
--workspace --all-targets -- -D warnings`, and `cargo test --workspace`
sequence now passes end to end.
Constraint: User explicitly required full-workspace fmt/clippy/test before commit/push
Constraint: Existing dirty leader worktree had to be stashed before attempted OMX team worktree launch
Rejected: Keep login/logout but hide them from help | left unsupported auth flow and saved OAuth fallback intact
Rejected: Stop after ROADMAP #37 targeted tests | did not satisfy required full-workspace verification gate
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Do not reintroduce saved OAuth as a silent Anthropic startup fallback without an explicit supported auth policy
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Remote push effects beyond origin/main update
ROADMAP #41 was still leaving a phantom-completion class open: managed
sessions could be resumed from the wrong workspace, and the CLI/runtime
paths were split between partially isolated storage and older helper
flows. This squashes the verified team work into one deliverable that
routes managed session operations through the per-worktree SessionStore,
rejects workspace mismatches explicitly, extends lane-event taxonomy for
workspace mismatch reporting, and updates the affected CLI regression
fixtures/docs so the new contract is enforced without losing same-
workspace legacy coverage.
Constraint: Keep same-workspace legacy flat sessions readable while blocking cross-worktree misuse
Constraint: No new dependencies; stay within the ROADMAP #41 changed-file scope
Rejected: Leave team auto-checkpoint history as final branch state | noisy/non-lore history for a single roadmap fix
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Preserve workspace_root validation on future resume/load helpers; do not reintroduce path-only fallback without equivalent mismatch checks
Tested: cargo test -p runtime session_control -- --nocapture; cargo test -p rusty-claude-cli resume -- --nocapture; cargo test -p rusty-claude-cli --test cli_flags_and_config_defaults; cargo test -p rusty-claude-cli --test output_format_contract; cargo test -p rusty-claude-cli --test resume_slash_commands; cargo test --workspace --exclude compat-harness; cargo check --workspace --all-targets; git diff --check
Not-tested: cargo clippy --workspace --all-targets -- -D warnings (pre-existing failures in unchanged rust/crates/rusty-claude-cli/build.rs)
Related: ROADMAP #41
Implements ROADMAP #40: Show warnings for broken/missing plugin manifests
instead of silently failing.
- Add PluginLoadFailure import
- New render_plugins_report_with_failures() function
- Shows ⚠️ warnings for failed plugin loads with error details
- Updates ROADMAP.md to mark #40 in progress
Ultraclaw droid session: ultraclaw-03-broken-plugins
#54 circular 'Did you mean /X?' for spec commands with no parse arm (done)
#55 /session list unsupported in resume mode (done)
#56 --resume no-command ignores --output-format json (done)
#57 session load errors bypass --output-format json (done)
'cargo install agent-code' installs 'agent.exe' (Windows) / 'agent'
(Unix), NOT 'agent-code'. Previous note said "binary name is 'agent-code'"
which sent users to the wrong command.
Updated the install warning to show the actual binary name.
ROADMAP #53 filed: package vs binary name mismatch in the install path.
The claw-code crate on crates.io is a deprecated stub. cargo install
claw-code succeeds but places claw-code-deprecated.exe, not claw.
Running it only prints 'claw-code has been renamed to agent-code'.
Previous note only warned about 'clawcode' (no hyphen) — the actual trap
is the hyphenated name.
Updated the warning block with explicit caution: do not use
'cargo install claw-code', install agent-code or build from source.
ROADMAP #52 filed.
Users were hitting:
- bash: cargo: command not found (Rust not installed or not on PATH)
- C:\... vs /c/... path confusion in Git Bash
- MINGW64 prompt misread as broken install
New '### Windows setup' section in README covers:
1. Install Rust via rustup.rs
2. Open Git Bash (MINGW64 is normal)
3. Verify cargo --version / run . ~/.cargo/env if missing
4. Use /c/Users/... paths
5. Clone + build + run steps
WSL2 tip added for lower-friction alternative.
ROADMAP #51 filed.
PowerShell tool is registered as danger-full-access regardless of command
semantics. Workspace-write sessions still require escalation for read-only
in-workspace commands (Get-Content, Get-ChildItem, etc.).
Root cause: mvp_tool_specs registers PowerShell and bash both with
PermissionMode::DangerFullAccess unconditionally. Fix needs command-level
heuristic analysis to classify read-only in-workspace commands at
WorkspaceWrite rather than DangerFullAccess.
Source: tanishq_devil in #claw-code 2026-04-10; traced by gaebal-gajae.
Before:
error: failed to extract manifests: No such file or directory (os error 2)
After:
error: failed to extract manifests: No such file or directory (os error 2)
looked in: /Users/yeongyu/clawd/claw-code/rust
The workspace_dir is computed from CARGO_MANIFEST_DIR at compile time and
only resolves correctly when running from the build tree. Surfacing the
resolved path lets users understand immediately why it fails outside the
build context.
ROADMAP #45 root cause (build-tree-only path) remains open.
When claw --output-format json hits an error, the error was previously
printed as plain prose to stderr, making it invisible to downstream tooling
that parses JSON output. Now:
{"type":"error","error":"api returned 401 ..."}
Detection: scan argv at process exit for --output-format json or
--output-format=json. Non-JSON error path unchanged. 156 CLI tests pass.
#21 (Resumed /status JSON parity gap): resolved by the broader
Resumed local-command JSON parity gap work tracked as #26. Re-verified
on main HEAD 8dc6580 — the regression test passes.
#29 (CLI provider dispatch hardcoded to Anthropic): landed at 8dc6580.
ApiProviderClient dispatch now routes correctly based on
detect_provider_kind. Original filing preserved as trace record.
#28 error-copy improvements landed on ff1df4c but real users (nicma,
Jengro) hit `error: missing Anthropic credentials` within hours when
using `--model openai/gpt-4` with OPENAI_API_KEY set and all
ANTHROPIC_* env vars unset on main.
Traced root cause in build_runtime_with_plugin_state at line ~6244:
AnthropicRuntimeClient::new() is hardcoded. BuiltRuntime is
statically typed as ConversationRuntime<AnthropicRuntimeClient, ...>.
providers::detect_provider_kind() computes the right routing at the
metadata layer but the runtime client is never dispatched.
Files #29 with the detailed trace + a focused action plan:
DynamicApiClient enum wrapping Anthropic + OpenAiCompat variants,
retype BuiltRuntime, dispatch in build_runtime based on
detect_provider_kind, integration test with mock OpenAI-compat
server.
#28 is marked partial — the error-copy improvements are real and
stayed in, but the routing gap they were meant to cover is the
actual bug and needs #29 to land.
Filed from live #claw-code dogfood on 2026-04-08 where two real users
hit adjacent auth confusion within minutes:
- varleg set OPENAI_API_KEY for OpenRouter but prefix routing didn't
win because the model name wasn't prefixed with openai/; unsetting
ANTHROPIC_API_KEY then hit MissingApiKey with no hint that the
OpenAI path was already configured
- stanley078852 put an sk-ant-* key in ANTHROPIC_AUTH_TOKEN instead
of ANTHROPIC_API_KEY, causing claw to send it as
Authorization: Bearer sk-ant-..., which Anthropic rejects at the
edge with 401 Invalid bearer token
Both fixes delivered live in #claw-code as direct replies, but the
pattern is structural: the error surface doesn't bridge HTTP-layer
symptoms back to env-var choice.
Action block spells out a single main-side PR with three
improvements: (a) MissingCredentials hint when an adjacent
provider's env var is already set, (b) 401-on-Anthropic hint when
bearer token starts with sk-ant-, (c) 'which env var goes where'
paragraph in both README matrices mapping sk-ant-* -> x-api-key and
OAuth access token -> Authorization: Bearer.
All three improvements are unit-testable against ApiError::fmt
output with no HTTP calls required.
The original ROADMAP #25 entry claimed the root cause was missing
exec bits on generated hook scripts. That was wrong — a chmod-only
fix (4f7b674) still failed CI. The actual bug was output_with_stdin
unconditionally propagating BrokenPipe from write_all when the child
exits before the parent finishes writing stdin.
Updated per gaebal-gajae's direction: actual fix, hygiene hardening,
and regression guard are now clearly separated. Added a meta-lesson
about Broken pipe ambiguity in fork/exec paths so future investigators
don't cargo-cult the same wrong first theory.
Filing per gaebal-gajae's status summary at message 1491322807026454579
in #clawcode-building-in-public, with corrected scope after re-running
`cargo test -p rusty-claude-cli` against main HEAD (79da4b8): the 11
deterministic failures only reproduce on dev/rust, not main, so this is
a dev/rust catchup item rather than a main regression.
Two-layered root cause documented:
1. dev/rust `parse_args` eagerly validates user plugin hook scripts
exist on disk before returning a CliAction
2. dev/rust test harness does not redirect $HOME/XDG_CONFIG_HOME to a
fixture (no `env_lock` equivalent — main has 30+ env_lock hits, dev
has zero)
Together they make dev/rust `cargo test -p rusty-claude-cli` fail on
any clean clone whose owner has a half-installed user plugin in
~/.claude/plugins/installed/. main has both the env_lock test isolation
AND the parse_args/hook-validation decoupling already; dev/rust is just
behind on the merge train.
Action block in #27 spells out backporting env_lock + the parse_args
decoupling so the next dev/rust release picks this up.
Linux CI keeps tripping over
`plugins::hooks::tests::collects_and_runs_hooks_from_enabled_plugins`
with `Broken pipe (os error 32)` when the hook runner tries to spawn a
child shell script that was written by `write_hook_plugin` without the
execute bit set. Fails on first attempt, passes on rerun (observed in CI
runs 24120271422 and 24120538408). Passes consistently on macOS.
Since issues are disabled on the repo, recording as ROADMAP backlog
item #25 in the Immediate Backlog P2 cluster next to the related plugin
lifecycle flake at #24. Action block spells out the chmod +755 fix in
`write_hook_plugin` plus the regression guard.
Concrete follow-up captured from today's dogfood session:
A single hung test (oversized-request preflight, 6 minutes per attempt
after `be561bf` silently swallowed count_tokens errors) crashed the
`cargo test --workspace` job before downstream crates could run, hiding
6 separate pre-existing CLI regressions until `8c6dfe5` + `5851f2d`
restored the fast-fail path.
Two new acceptance criteria for #9:
- per-test timeouts in CI so one hang cannot mask other failures
- distinguish `test.hung` from generic test failures in worker reports
Filed: openai/ prefix model misrouting (fixed in 0530c50).
Documents root cause, fix, and the architectural lesson:
- metadata_for_model is the canonical extension point for new providers
- auth-sniffer fallback order must never override explicit model-name prefix
- regression test locked in to guard this invariant
Closes the ambiguity gaebal-gajae flagged: downstream tooling was left
guessing which integration surface to build against.
Decision: claw state + .claw/worker-state.json is the blessed contract.
HTTP endpoint not scheduled. Rationale documented:
- plugin scope constraint (can't add routes to opencode serve)
- file polling has lower latency and fewer failure modes than HTTP
- HTTP would require upstreaming to sst/opencode or a fragile sidecar
Clawhip integration contract documented:
- poll .claw/worker-state.json after WorkerCreate
- seconds_since_update > 60 in trust_required = stall signal
- WorkerResolveTrust to unblock, WorkerRestart to reset