The CLI already reframes direct preflight and provider oversized-request
errors, but retry-wrapped provider failures still fell back to the generic
retry-exhausted surface because the user-visible formatter keyed off the
safe failure class. Route formatting through nested context-window
detection so wrapped provider failures keep the same compact/reduce-scope
guidance.
Constraint: Keep the fix UX-scoped without widening broader failure classification behavior
Rejected: Reorder safe_failure_class for all RetriesExhausted errors | broader semantic change than needed for this issue
Confidence: high
Scope-risk: narrow
Directive: Keep context-window rendering keyed to nested error inspection so provider wrappers do not lose recovery guidance
Tested: cargo fmt --check; cargo test -p rusty-claude-cli context_window; cargo test -p api oversized
Not-tested: Full workspace test suite
The CLI still treated every /skills payload other than list/install/help as local usage text, so skills that appeared in /skills could not actually be invoked. This restores prompt dispatch for /skills <skill> [args], keeps list/install on the local path, and shares skill resolution with the Skill tool so project-local and legacy /commands entries resolve consistently.
Constraint: --resume local slash execution still only supports local commands without provider turns
Rejected: Implement full resumed prompt-turn execution for /skills | larger behavior change outside this bugfix
Rejected: Keep separate skill lookups in tools and commands | drift already caused listing/invocation mismatches
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep /skills discovery, CLI prompt dispatch, and Tool Skill resolution on the same registry semantics
Tested: cargo fmt --all; cargo clippy -p commands -p tools -p rusty-claude-cli --all-targets -- -D warnings; cargo test --workspace -- --nocapture
Not-tested: Live provider-backed /skills invocation against external skill packs in an interactive REPL
Dogfood showed oversized requests still surfacing as raw hard errors, even when claw could tell the user exactly how to recover. This keeps context-window failures classified, recognizes the same failure when it comes back from a provider response, and renders recovery steps that point operators at the existing compaction and fresh-session paths instead of a provider-style dump.
Constraint: Keep the failure class explicit so automation and operators can still distinguish context-window exhaustion from generic provider failures
Constraint: Reuse existing /compact and session-reset UX instead of inventing a new recovery workflow
Rejected: Auto-run compaction on failure | mutates session state on an error path the user may want to inspect first
Rejected: Only prettify local preflight failures | provider-returned context-window errors would still leak raw failure text
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep provider-side context-window detection aligned with real oversized-request messages before broadening the marker list
Tested: cargo fmt --all --check
Tested: cargo test -p api
Tested: cargo test -p rusty-claude-cli
Tested: cargo clippy -p api -p rusty-claude-cli --all-targets -- -D warnings
Not-tested: cargo test --workspace
Resumed slash dispatch was still dropping back to prose for several JSON-capable local commands, which forced automation to special-case direct CLI invocations versus --resume flows. This routes resumed local-command handlers through the same structured JSON payloads used by direct status, sandbox, inventory, version, and init commands, and records the inventory parity audit result in the roadmap.
Constraint: Text-mode resumed output must stay unchanged for existing shell users
Rejected: Teach callers to scrape resumed text output | brittle and defeats the JSON contract
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When a direct local command has a JSON renderer, keep resumed slash dispatch on the same serializer instead of adding one-off format branches
Tested: cargo fmt --check; cargo test --workspace; cargo clippy --workspace --all-targets -- -D warnings
Not-tested: Live provider-backed REPL resume flows outside the local test harness
`claw agents --output-format json` was still wrapping the text report,
which meant automation could not distinguish empty inventories from
populated agent definitions. Add a dedicated structured handler in the
commands crate, wire the CLI to it, and extend the contracts to cover
both empty and populated agent listings.
Constraint: Keep text-mode `claw agents` output unchanged while aligning JSON behavior with existing structured inventory handlers
Rejected: Parse the text report into JSON in the CLI layer | brittle duplication and no reusable structured handler
Confidence: high
Scope-risk: narrow
Directive: Keep inventory subcommands on dedicated structured handlers instead of serializing human-readable reports
Tested: cargo test -p commands renders_agents_reports_as_json; cargo test -p rusty-claude-cli --test output_format_contract; cargo test --workspace; cargo fmt --check; cargo clippy --workspace --all-targets -- -D warnings
Not-tested: Manual invocation of `claw agents --output-format json` outside automated tests
Issue #22 was triggered by generic upstream fatal wrappers that only surfaced 'Something went wrong', which left repeated Jobdori-style failures opaque in the CLI. Capture provider request ids on error responses, classify the known generic wrapper as provider_internal, and prefix the user-visible runtime error with the failure class plus session/trace identifiers so operators can correlate the failure quickly.
Constraint: Keep the fix small and user-safe without redesigning the broader runtime error taxonomy
Constraint: Preserve existing non-generic error text unless the wrapper is the known opaque fatal surface
Rejected: Broadly rewriting every runtime error into classified envelopes | unnecessary scope expansion for issue #22
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If more opaque wrappers appear, extend the marker list and classification helper rather than reintroducing raw wrapper text alone
Tested: cargo test -p api detects_generic_fatal_wrapper_and_classifies_it_as_provider_internal -- --nocapture; cargo test -p api retries_exhausted_preserves_nested_request_id_and_failure_class -- --nocapture; cargo test -p rusty-claude-cli opaque_provider_wrapper_surfaces_failure_class_session_and_trace -- --nocapture; cargo test -p rusty-claude-cli retry_exhaustion_preserves_internal_failure_class_for_generic_provider_wrapper -- --nocapture; cargo test --workspace
Not-tested: Live upstream reproduction of the Jobdori failure against a real provider session
The resumed slash-command path built a reduced status JSON payload by hand, so it drifted from the fresh status schema and dropped metadata like model, permission mode, workspace counters, and sandbox details. Reuse a shared status JSON builder for both code paths and tighten the resume regression tests to lock parity in place.
Constraint: Resume mode does not carry an active runtime model, so restored sessions continue to report the existing restored-session sentinel value
Rejected: Copy the fresh status JSON shape into the resume path again | would recreate the same schema drift risk
Confidence: high
Scope-risk: narrow
Directive: Keep resumed and fresh /status JSON on the same helper so future schema changes stay in parity
Tested: Reproduced failure in temporary HEAD worktree with strengthened resumed_status_command_emits_structured_json_when_requested
Tested: cargo test -p rusty-claude-cli resumed_status_command_emits_structured_json_when_requested --test resume_slash_commands -- --exact --nocapture
Tested: cargo test -p rusty-claude-cli doctor_and_resume_status_emit_json_when_requested --test output_format_contract -- --exact --nocapture
Tested: cargo test --workspace
Tested: cargo fmt --check
Tested: cargo clippy --workspace --all-targets -- -D warnings
Finish the remaining roadmap work by making direct CLI JSON output deterministic across the non-interactive surface, restoring the degraded-startup MCP test as a real workspace test, and adding branch-lock plus commit-lineage primitives so downstream lane consumers can distinguish superseded worktree commits from canonical lineage.
Constraint: Keep the user-facing config namespace centered on .claw while preserving legacy fallback discovery for compatibility
Constraint: Verification needed to stay clean-room and reproducible from the checked-in workspace alone
Rejected: Leave the output-format contract implied by ad-hoc smoke runs only | too easy for direct CLI regressions to slip back into prose-only output
Rejected: Keep commit provenance as free-form detail text | downstream consumers need structured branch/worktree/supersession metadata
Confidence: medium
Scope-risk: moderate
Directive: Extend the JSON contract through the same direct CLI entrypoints instead of adding one-off serializers on parallel code paths
Tested: python .github/scripts/check_doc_source_of_truth.py
Tested: cd rust && cargo fmt --all --check
Tested: cd rust && cargo test --workspace
Tested: cd rust && cargo clippy -p commands -p tools -p rusty-claude-cli --all-targets --no-deps -- -D warnings
Not-tested: full cargo clippy --workspace --all-targets -- -D warnings still reports unrelated pre-existing runtime lint debt outside this change set
The skills and mcp inventory handlers were still emitting prose tables even when the global --output-format json flag was set. This wires structured JSON renderers into the command handlers and CLI dispatch so direct invocations and resumed slash-command execution both return machine-readable payloads while preserving existing text output in the REPL path.
Constraint: Must preserve existing text output and help behavior for interactive slash commands
Rejected: Parse existing prose tables into JSON at the CLI edge | brittle and loses structured fields
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep text and JSON variants driven by the same command parsing branches so --output-format stays deterministic across entry points
Tested: cargo test -p commands
Tested: cargo test -p rusty-claude-cli
Not-tested: Manual invocation against a live user skills registry or external MCP services
Promote doctor into a real top-level CLI action, reuse the same local report for resumed and REPL doctor invocations, and intercept doctor/status/sandbox help flags before prompt-mode dispatch. The parser change also closes the help fallthrough that previously wandered into runtime startup for local-info commands.
Constraint: Preserve prompt shorthand for normal multi-word text input while fixing exact local subcommand help paths
Rejected: Route \7[1G[2K[m⠋ 🦀 Thinking...[0m8[1G[2K[m✘ ❌ Request failed
[0m through prompt/slash guidance | still shells out through the wrong surface and keeps health checks hidden
Rejected: Reuse the status report as doctor output | status does not explain auth/config health or expose a dedicated diagnostic summary
Confidence: high
Scope-risk: narrow
Directive: Keep doctor local-only unless an explicit network probe is intentionally added and separately tested
Tested: cargo build -p rusty-claude-cli; cargo test -p rusty-claude-cli; cargo run -p rusty-claude-cli -- doctor --help; CLAW_CONFIG_HOME=/tmp/tmp.7pm9SVzOPN ANTHROPIC_API_KEY= ANTHROPIC_AUTH_TOKEN= cargo run -p rusty-claude-cli -- doctor
Not-tested: direct /doctor outside the REPL remains interactive-only
The global --output-format json flag already reached prompt-mode responses, but
status and sandbox still bypassed that path and printed human-readable tables.
This change threads the selected output format through direct command aliases
and resumed slash-command execution so status queries emit valid structured
JSON instead of mixed prose.
It also adds end-to-end regression coverage for direct status/sandbox JSON
and resumed /status JSON so shell automation can rely on stable parsing.
Constraint: Global output formatting must stay compatible with existing text-mode reports
Rejected: Require callers to scrape text status tables | fragile and breaks automation
Confidence: high
Scope-risk: narrow
Directive: New direct commands that honor --output-format should thread the format through CliAction and resumed slash execution paths
Tested: cargo build -p rusty-claude-cli
Tested: cargo test -p rusty-claude-cli -- --nocapture
Tested: cargo test --workspace
Tested: cargo run -q -p rusty-claude-cli -- --output-format json status
Tested: cargo run -q -p rusty-claude-cli -- --output-format json sandbox
Not-tested: cargo clippy --workspace --all-targets -- -D warnings (fails in pre-existing runtime files unrelated to this change)
The worker boot registry now exposes the requested lifecycle states, emits structured trust and prompt-delivery events, and recovers from shell or wrong-target prompt delivery by replaying the last prompt. Supporting fixes keep MCP remote config parsing backwards-compatible and make CLI argument parsing less dependent on ambient config and cwd state so the workspace stays green under full parallel test runs.
Constraint: Worker prompts must not be dispatched before a confirmed ready_for_prompt handshake
Constraint: Prompt misdelivery recovery must stay minimal and avoid new dependencies
Rejected: Keep prompt_accepted and blocked as public lifecycle states | user requested the narrower explicit state set
Rejected: Treat url-only MCP server configs as invalid | existing CLI/runtime tests still rely on that shorthand
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Preserve prompt_in_flight semantics when extending worker boot; misdelivery detection depends on it
Tested: cargo build --workspace; cargo test --workspace
Not-tested: Live tmux worker delivery against a real external coding agent pane
Add MCP structured degraded-startup classification (P2.10):
- classify MCP failures as startup/handshake/config/partial
- expose failed_servers + recovery_recommendations in tool output
- add mcp_degraded output field with server_name, failure_mode, recoverable
Canonical lane event schema (P2.7):
- add LaneEventName variants for all lifecycle states
- wire LaneEvent::new with full 3-arg signature (event, status, emitted_at)
- emit typed events for Started, Blocked, Failed, Finished
Fix let mut executor for search test binary
Fix lane_completion unused import warnings
Note: mcp_stdio::manager_discovery_report test has pre-existing failure on clean main, unrelated to this commit.
Replace with_current_dir+render_diff_report() with direct render_diff_report_for(&root)
calls in the three diff-report tests. The env_lock mutex only serializes within one
test binary; cargo test --workspace runs binaries in parallel, so set_current_dir races
were possible across binaries. render_diff_report_for(cwd) accepts an explicit path
and requires no global state mutation, making the tests reliably green under full
workspace parallelism.
This resolves the stale-branch merge against origin/main, keeps the MCP runtime wiring, and preserves prompt-approved CLI tool execution after the mock parity harness additions landed upstream.
Constraint: Branch had to absorb origin/main changes through a contentful merge before more MCP work
Constraint: Prompt-approved runtime tool execution must continue working with new CLI/mock parity coverage
Rejected: Keep permission enforcer attached inside CliToolExecutor for conversation turns | caused prompt-approved bash parity flow to fail as a tool error
Rejected: Defer the merge and continue on stale history | would leave the lane red against current main
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Runtime permission policy and executor-side permission enforcement are separate layers; do not reapply executor enforcement to conversation turns without revalidating mock parity harness approval flows
Tested: cargo test -p rusty-claude-cli --test mock_parity_harness -- --nocapture; cargo test -p rusty-claude-cli -- --nocapture; cargo test --workspace -- --nocapture
Not-tested: Additional live remote/provider scenarios beyond the existing workspace suite
This wires configured MCP servers into the CLI/runtime path so discovered
MCP tools, resource wrappers, search visibility, shutdown handling, and
best-effort discovery all work together instead of living as isolated
runtime primitives.
Constraint: Keep non-MCP startup flows working without new required config
Constraint: Preserve partial availability when one configured MCP server fails discovery
Rejected: Fail runtime startup on any MCP discovery error | too brittle for mixed healthy/broken server configs
Rejected: Keep MCP support runtime-only without registry wiring | left discovery and invocation unreachable from the CLI tool lane
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Runtime MCP tools are registry-backed but executed through CliToolExecutor state; keep future tool-registry changes aligned with that split
Tested: cargo test -p runtime mcp -- --nocapture; cargo test -p tools -- --nocapture; cargo test -p rusty-claude-cli -- --nocapture; cargo test --workspace -- --nocapture
Not-tested: Live remote MCP transports (http/sse/ws/sdk) remain unsupported in the CLI execution path
Tests relying on PermissionMode::DangerFullAccess as default were
flaky under --workspace runs because other tests set
RUSTY_CLAUDE_PERMISSION_MODE without cleanup. Added env_lock() and
explicit env var removal to 7 affected tests.
Fixes: workspace-level cargo test flake (1 random test fails per run)
Remove the #[ignore] gate from startup_banner_mentions_workflow_completions
by injecting a dummy ANTHROPIC_API_KEY. The test exercises LiveCli banner
rendering, not API calls. Cleanup env var after test.
Test suite now 102/102 in CLI crate (was 101 + 1 ignored).
Inject a dummy ANTHROPIC_API_KEY for
build_runtime_runs_plugin_lifecycle_init_and_shutdown so the test
exercises plugin init/shutdown without requiring real credentials.
The API client is constructed but never used for streaming.
Clean up the env var after the test to avoid polluting parallel tests.