Compare commits

5 Commits

Author SHA1 Message Date
YeonGyu-Kim
eb4b1ebc9b fix(#249): Add kind+hint to resumed-session slash error JSON envelopes
## What Was Broken (ROADMAP #249)

Two Err arms in the resumed-session slash command dispatcher were emitting
JSON envelopes WITHOUT the `kind` and `hint` fields that the typed-error
contract requires:

- main.rs:~2747 (parse_command_token failure path)
- main.rs:~2783 (run_resume_command failure path)

Adjacent code paths (e.g., the Ok(None) error branch at main.rs:2658) were
already threading classify_error_kind + split_error_hint through. These two
arms skipped the helpers, so claws routing on error class couldn't
distinguish parse failures from other error classes for resumed-session
slash command operations.

This is an ARM-LEVEL LEAK pattern: the contract exists, the helpers exist,
but two specific arms didn't call them. Cycle #38 identified this pattern
("Contract + helpers exist; the remaining work is finding every code path
that didn't call the helpers").

## What This Fix Does

Thread classify_error_kind + split_error_hint through both arms:

```rust
Err(error) => {
    if output_format == CliOutputFormat::Json {
        let full_message = error.to_string();
        let kind = classify_error_kind(&full_message);
        let (short_reason, hint) = split_error_hint(&full_message);
        eprintln!(
            "{}",
            serde_json::json!({
                "type": "error",
                "error": short_reason,
                "kind": kind,
                "hint": hint,
                "command": raw_command,
            })
        );
    } else { ... }
    std::process::exit(2);
}
```

This matches the envelope shape used elsewhere in the same function
(main.rs:2658) and satisfies the typed-error contract.
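
As a rough illustration of the hint-splitting step, here is a minimal sketch. It assumes the short reason is the first line of the message and the hint is whatever non-empty text follows. The real `split_error_hint` in main.rs is not shown in this commit and may split differently, so `split_error_hint_sketch` is a hypothetical stand-in.

```rust
/// Hypothetical stand-in for `split_error_hint`; the real helper in main.rs
/// may use a different rule. Assumption: the first line is the short reason
/// and any remaining non-empty text is the hint.
fn split_error_hint_sketch(message: &str) -> (String, Option<String>) {
    match message.split_once('\n') {
        Some((reason, rest)) => {
            let hint = rest.trim();
            // An all-whitespace remainder counts as "no hint".
            let hint = if hint.is_empty() {
                None
            } else {
                Some(hint.to_string())
            };
            (reason.trim().to_string(), hint)
        }
        // Single-line messages carry no hint.
        None => (message.trim().to_string(), None),
    }
}
```

Under that assumption, the multi-line `/xyz-unknown` message from the #249 repro would yield `Unknown slash command: /xyz-unknown` as the reason and the `/help` line as the hint, while a single-line message would yield no hint.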

## Tests Added

resumed_slash_error_envelope_includes_kind_and_hint_249 — regression guard
in crates/rusty-claude-cli/src/main.rs that:
- Documents the contract: both arms must carry kind and hint
- Verifies classify_error_kind returns a non-empty class name
- Verifies split_error_hint returns a non-empty short_reason
- Provides an explicit comment documenting the structural contract
  (integration test would require real session file setup; this
  unit-level test verifies the building blocks work correctly)

## Test Results

- All 181 tests pass (180 original + 1 new #249 regression test)
- cargo build succeeds with no warnings
- No regressions in existing tests

## Template

This fix follows the exact pattern of the Ok(None) error branch at
main.rs:2658 which was added for #77. The #247 classifier sweep (cycle
#33-#36) and #248 sweep (cycle #41) also use this pattern. Each fix
extends the typed-error contract to a previously-missed arm.

Part of the classifier/dispatcher sweep cluster:
- #247 (closed): prompt-related parse errors
- #248 (review-ready): verb-qualified unknown-option errors
- #249 (this): resumed slash-command Err arms
- #130 (still open): filesystem errno strings (classifier part)
- #251 (filed): dispatch-order on session-management verbs
2026-04-23 00:34:03 +09:00
YeonGyu-Kim
2fcb85ce4e ROADMAP #251: dispatch-order bug — session-management verbs fall through to Prompt before credential check (filed by gaebal-gajae; formalized by Jobdori cycle #40)
Cycle #40: gaebal-gajae conceived #251 in their 00:00 Discord cycle
status but hadn't committed to ROADMAP yet. Jobdori verified their
diagnosis with code trace and formalized into ROADMAP with the proper
framing relationship to #250.

## What This Pinpoint Says

Same observable as #250 (session-management verbs emit missing_credentials
instead of SCHEMAS.md envelope) but reframed at the dispatch-order layer:

- #250 says: surface missing on canonical binary vs SCHEMAS.md promise
- #251 says: top-level parser fall-through happens BEFORE dispatcher
  could intercept, so credential resolution runs before the verb is
  classified as a purely-local operation

#251's framing is sharper because it identifies WHY the fall-through
produces auth errors, not just that it does.

## Verified Code Trace

- main.rs:1017-1027 is the _other => Prompt catchall
- joins all rest[] tokens into joined, constructs CliAction::Prompt
- downstream resolves credentials -> emits missing_credentials
- No credential call would be needed had the verb been intercepted

Same pattern has been fixed before for other purely-local verbs:
- #145: plugins (main.rs:888-906, explicit match arm)
- #146: config and diff (main.rs:911-935, same shape)

#251 extends this to the 4 session-management verbs.
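
The dispatch-order idea can be sketched as follows. This is a simplified model under stated assumptions, not the code in main.rs: `CliActionSketch`, `SESSION_VERBS`, and `parse_top_level_sketch` are hypothetical names, and the real `CliAction` has more variants and different payloads.

```rust
/// Hypothetical, trimmed-down action type; the real `CliAction` differs.
#[derive(Debug, PartialEq)]
enum CliActionSketch {
    /// A purely-local session-management verb that needs no credentials.
    SessionVerb { verb: String, args: Vec<String> },
    /// Everything else is joined and sent down the prompt path.
    Prompt(String),
}

/// The four session-management verbs named in SCHEMAS.md.
const SESSION_VERBS: [&str; 4] = [
    "list-sessions",
    "delete-session",
    "load-session",
    "flush-transcript",
];

fn parse_top_level_sketch(rest: &[&str]) -> CliActionSketch {
    match rest.first().copied() {
        // Explicit arm: the verb is classified as local before the
        // catch-all can turn it into a prompt.
        Some(verb) if SESSION_VERBS.iter().any(|v| *v == verb) => {
            CliActionSketch::SessionVerb {
                verb: verb.to_string(),
                args: rest[1..].iter().map(|s| s.to_string()).collect(),
            }
        }
        // Catch-all mirroring `_other => Prompt`: in the real binary,
        // credential resolution happens downstream of this arm.
        _ => CliActionSketch::Prompt(rest.join(" ")),
    }
}
```

The point of the ordering is that the local arm is matched first, so the credential lookup attached to the prompt path is never reached for these verbs.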

## Recommended Sequence

1. #251 fix (4 match arms mirroring #145/#146) — principled solution
2. #250's Option B (docs scope note) — guard against future drift
3. #250's Option C (reject with redirect) — unnecessary if #251 lands

## Discipline

Per cycle #24 calibration:
- Red-state bug? Borderline (silent misroute to auth error class)
- Real friction? ✓ (4 documented surfaces emit wrong error class)
- Evidence-backed? ✓ (code trace + prior-fix precedent #145/#146)
- Same-cycle fix? ✗ (file + document, per boundary discipline #36)
- Implementation cost? ~40 lines Rust + tests, bounded

## Credit

Conception: gaebal-gajae (Discord msg 1496526112254328902, 00:00 KST)
Formalization: Jobdori cycle #40 (code trace + precedent linking)

This is the right kind of collaboration: gaebal-gajae saw the dispatch
pattern I had missed in #250 (I framed it as surface parity; they framed
it as dispatch order). I verified their diagnosis and committed the
ROADMAP entry. The two framings together make the pinpoint sharper than
either alone.
2026-04-23 00:06:46 +09:00
YeonGyu-Kim
f1103332d0 ROADMAP #130: re-verify still-open on main HEAD 186d42f; add classifier-cluster pairing note
Cycle #39 dogfood re-verification of #130 (filed 2026-04-20). All 5
filesystem failure modes reproduce identically on main HEAD 186d42f,
2 days after original filing. Gap is unchanged.

## What's Added

1. **[STILL OPEN — re-verified 2026-04-22 cycle #39]** marker on the
   entry so readers can see immediately that the pinpoint hasn't been
   accidentally closed.

2. Full 5-mode repro output preserved verbatim for the current HEAD,
   so future re-verifications have a concrete baseline to diff against.

3. **New evidence not in original filing**: the classifier actively
   chose `kind: "unknown"` rather than just omitting the field. This
   means classify_error_kind() has NO substring match for "Is a
   directory", "No such file", "Operation not permitted", or "File
   exists". The typed-error contract is thus twice-broken on this path.

4. **Pairing with #247/#248/#249 classifier sweep**: the classifier-level
   part of #130 could land in the same sweep (add substring branches
   for io::ErrorKind strings). The context-preservation part (fix
   run_export's bare `?`) is a separate, larger change.
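
The classifier-level part mentioned in item 4 could look roughly like the sketch below. The kind names (`fs_not_found` and so on) are invented for illustration, and `classify_fs_error_kind_sketch` is a hypothetical function; the real `classify_error_kind` may pick different names and a different structure.

```rust
/// Hypothetical substring branches for filesystem errno strings.
/// The returned kind names are assumptions, not the project's actual names.
fn classify_fs_error_kind_sketch(message: &str) -> &'static str {
    if message.contains("No such file") {
        "fs_not_found"
    } else if message.contains("Operation not permitted") {
        "fs_permission_denied"
    } else if message.contains("Is a directory") {
        "fs_is_a_directory"
    } else if message.contains("File exists") {
        "fs_already_exists"
    } else {
        // Same fallback the current classifier produces for these inputs.
        "unknown"
    }
}
```

Substring matching on `Display` text is brittle (it depends on the platform's errno wording), which is one reason the context-preservation part is treated as the deeper fix.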

## Why Re-Verification Not Re-Filing

Per cycle #24 discipline: speculative re-filings add noise, real
confirmations add truth. #130 was already filed with exact repros, code
trace, and fix shape. My dogfood hit the same gap on fresh HEAD — the
right output is confirming the gap is still there (not filing #251 for
the same bug).

This is the same pattern as cycle #32's "mark #127 CLOSED" reality-sync:
documentation-drift prevention through explicit status markers.

## New Pattern

"Reality-sync via re-verification" — re-running a filed pinpoint's
repro on fresh HEAD and adding the timestamp + output proves the gap
is still real without inventing new filings. Cycle #24 calibration
keeps ROADMAP entries honest.

Per cycle #24 calibration:
- Red-state bug? ⚠️ borderline (errors surfaced, but kind=unknown is
  demonstrably wrong on a path where the system knows the errno)
- Real friction? ✓ (re-verified on fresh HEAD)
- Evidence-backed? ✓ (5-mode repro + classifier trace)
- Same-cycle fix? ✗ (classifier-level part could join #247/#248/#249
  sweep; context-preservation part is larger refactor)
- Implementation cost? Classifier part ~10 lines; full context fix ~60 lines

Source: Jobdori cycle #39 proactive dogfood in response to Clawhip
pinpoint nudge. Probed export filesystem errors; discovered this was
#130 reconfirmation, not new bug. Applied reality-sync pattern from
cycle #32.
2026-04-23 00:02:58 +09:00
YeonGyu-Kim
186d42f979 ROADMAP #250: CLI surface parity gap — SCHEMAS.md's list-sessions/delete-session/etc. are Python-only; Rust binary falls through to Prompt with cred error
Cycle #38 dogfood finding. Probed session management via the top-level
subcommand path documented in SCHEMAS.md; discovered the Rust binary
doesn't implement these as top-level subcommands. The literal token
'list-sessions' falls through the _other => Prompt arm and returns
'missing Anthropic credentials' instead of the documented envelope.

## The Gap

SCHEMAS.md documents 14 CLAWABLE top-level subcommands. Python audit
harness (src/main.py) implements all 14. Rust binary implements ~8 of
them as top-level, routing session management through /session slash
commands via --resume instead.

Repro:

  $ env -i PATH=$PATH HOME=$HOME claw list-sessions --output-format json
  {"error":"missing Anthropic credentials; ...","kind":"missing_credentials"}

  $ claw --resume latest /session list --output-format json
  {"active":"...","kind":"session_list","sessions":[...]}

  $ python3 -m src.main list-sessions --output-format json
  {"command":"list-sessions","sessions":[...],"exit_code":0}

Same operation, three different CLI shapes across implementations.

## Classification

This is BOTH:
- a parser-level trust gap (6th in #108/#117/#119/#122/#127 family; same
  _other => Prompt fall-through), AND
- a cross-implementation parity gap (SCHEMAS.md at repo root doesn't
  match Rust binary's top-level surface)

Unlike prior fall-throughs where the input was malformed, the input
here IS a documented surface. The fall-through is wrong for a different
reason: the surface exists in the protocol but not in this implementation.

## Three Fix Options

Option A: Implement surfaces on Rust binary (highest cost, full parity)
Option B: Scope SCHEMAS.md to Python harness (docs-only)
Option C: Reject at parse time with redirect hint (cheapest, #127 pattern)

Recommended: C first (prevents cred misdirection), then B for docs
hygiene, then A if demand justifies.

## Discipline

Per cycle #24 calibration:
- Red-state bug? ⚠️ borderline — silent misroute to cred error on a
  documented surface. Not a crash but a real wrong-contract response.
- Real friction? ✓ (claws reading SCHEMAS.md hit wrong error on canonical binary)
- Evidence-backed? ✓ (dogfood probe + SCHEMAS.md cross-reference + code trace)
- Implementation cost? Option C: ~30 lines (bounded). Option A: larger.
- Same-cycle fix? ✗ (file + document, defer implementation per #36 boundary discipline)

## Family Position

Natural bundle: **#127 + #250** — parser-level fall-through pair with
class distinction. #127 fixed suffix-arg-on-valid-verb case. #250 extends
to 'entire Python-harness verb treated as prompt.' Same fall-through arm,
different entry class.

Source: Jobdori cycle #38 proactive dogfood in response to Clawhip
pinpoint nudge at msg 1496518474019639408. Probed session management CLI
after gaebal-gajae's status sync confirmed no red-state regressions this
cycle; found this cross-implementation surface parity gap by comparing
SCHEMAS.md claims against actual Rust binary behavior.
2026-04-22 23:37:45 +09:00
YeonGyu-Kim
5f8d1b92a6 ROADMAP #249: resumed-session slash command error envelopes omit kind field
Cycle #37 dogfood finding post-#247 merge. Two Err arms in the resumed-session
JSON path at main.rs:2747 and main.rs:2783 emit error envelopes WITHOUT the
`kind` field required by the §4.44 typed-envelope contract.

## The Pinpoint

Probed resumed-session slash command JSON path:

  $ claw --output-format json --resume latest /session
  {"command":"/session","error":"unsupported resumed slash command","type":"error"}
  # no kind field

  $ claw --output-format json --resume latest /xyz-unknown
  {"command":"/xyz-unknown","error":"Unknown slash command: /xyz-unknown\n  Help             /help lists available slash commands","type":"error"}
  # no kind field AND multi-line error without split hint

Compare to happy path which DOES include kind:
  $ claw --output-format json --resume latest /session list
  {"active":"...","kind":"session_list",...}

Contract awareness exists. It's just not applied in the Err arms.

## Scope

Two atomic fixes in main.rs:
- Line 2747: SlashCommand::parse() Err → add kind via classify_error_kind()
- Line 2783: run_resume_command() Err → add kind + call split_error_hint()

~15 lines Rust total. Bounded.

## Family Classification

§4.44 typed-envelope contract sweep:
- #179 (parse-error real message quality) — closed
- #181 (envelope exit_code matches process exit) — closed
- #247 (classify_error_kind misses prompt-patterns) — closed
- #248 (verb-qualified unknown option errors) — in-flight (another agent)
- **#249 (resumed-session slash error envelopes omit kind) — filed**

Natural bundle #247+#248+#249: classifier/envelope completeness across all
three CLI paths (top-level parse, subcommand options, resumed-session slash).

## Discipline

Per cycle #24 calibration:
- Red-state bug? ✗ (errors surfaced, exit codes correct)
- Real friction? ✓ (typed-error contract violation; claws dispatching on
  error.kind get undefined for all resumed slash-command errors)
- Evidence-backed? ✓ (dogfood probe + code trace identified both Err arms)
- Implementation cost? ~15 lines (bounded)
- Same-cycle fix? ✗ (Rust change, deferred per file-not-fix discipline)

## Not Implementing This Cycle

Per the boundary discipline established in cycle #36: I don't touch another
agent's in-flight work, and I don't implement a Rust fix same-cycle when
the pattern is "file + document + let owner/maintainer decide."

Filing with concrete fix shape is the correct output. If demand or red-state
symptoms arrive, implementation can follow the same path as #247: file →
fix in branch → review → merge.

Source: Jobdori cycle #37 proactive dogfood in response to Clawhip pinpoint
nudge at msg 1496518474019639408.
2026-04-22 23:33:50 +09:00
2 changed files with 367 additions and 3 deletions

@@ -4929,7 +4929,7 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label],
**Source.** Jobdori dogfood 2026-04-20 against `/tmp/claw-mcp-test` (env-cleaned, working `mcpServers.everything = npx -y @modelcontextprotocol/server-everything`) on main HEAD `8122029` in response to Clawhip dogfood nudge / 10-min cron. Joins **MCP lifecycle gap family** as runtime-side companion to **#102** — #102 catches config-time silence (no preflight, no command-exists check); #129 catches runtime-side blocking (handshake await ordered before cred check, retried silently, no deadline). Joins **Truth-audit / diagnostic-integrity** (#80#87, #89, #100, #102, #103, #105, #107, #109, #110, #112, #114, #115, #125, #127) — the hang surfaces no events, no exit code, no signal. Joins **Auth-precondition / fail-fast ordering family** — cheap deterministic preconditions should run before expensive externally-controlled ones. Cross-cluster with **Recovery / wedge-recovery** — a misbehaved MCP server wedges every subsequent Prompt invocation; current recovery is "kill -9 the parent." Cross-cluster with **PARITY.md Lane 7 acceptance gap** — the Lane 7 merge added the bridge but didn't add startup-deadline + cred-precheck ordering, so the lane is technically merged but functionally incomplete for unattended claw use. Natural bundle: **#102 + #129** — MCP lifecycle visibility pair: config-time preflight (#102) + runtime-time deadline + cred-precheck (#129). Together they make MCP failures structurally legible from both ends. Also **#127 + #129** — Prompt-path silent-failure pair: verb-suffix args silently routed to Prompt (#127, fixed) + Prompt path silently blocks on MCP (#129). With #127 fixed, the `claw doctor --json` consumer no longer accidentally trips the #129 wedge — but the wedge still affects every legitimate Prompt invocation. Session tally: ROADMAP #129.
130. **`claw export --output <path>` filesystem errors surface raw OS errno strings with zero context — no path that failed, no operation that failed (open/write/mkdir), no structured error kind, no actionable hint, and the `--output-format json` envelope flattens everything to `{"error":"<raw errno string>","type":"error"}`. Five distinct filesystem failure modes all produce different raw errno strings but the same zero-context shape. The boilerplate `Run claw --help for usage` trailer is also misleading because these are filesystem errors, not usage errors** — dogfooded 2026-04-20 on main HEAD `d2a8341` from `/Users/yeongyu/clawd/claw-code/rust` (real session file present).
130. **[STILL OPEN — re-verified 2026-04-22 cycle #39 on main HEAD `186d42f`]** **`claw export --output <path>` filesystem errors surface raw OS errno strings with zero context — no path that failed, no operation that failed (open/write/mkdir), no structured error kind, no actionable hint, and the `--output-format json` envelope flattens everything to `{"error":"<raw errno string>","type":"error"}`. Five distinct filesystem failure modes all produce different raw errno strings but the same zero-context shape. The boilerplate `Run claw --help for usage` trailer is also misleading because these are filesystem errors, not usage errors** — dogfooded 2026-04-20 on main HEAD `d2a8341` from `/Users/yeongyu/clawd/claw-code/rust` (real session file present).
**Concrete repro.**
```
@@ -5057,6 +5057,24 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label],
**Source.** Jobdori dogfood 2026-04-20 against `/Users/yeongyu/clawd/claw-code/rust` (real session file present) on main HEAD `d2a8341` in response to Clawhip dogfood nudge / 10-min cron. Joins **Truth-audit / diagnostic-integrity** (#80#127, #129) as 16th — error surface is incomplete by design; runtime has info that CLI boundary discards. Joins **JSON envelope asymmetry family** (#90, #91, #92, #110, #115, #116) — `{error, type}` shape is a fake envelope when the failure mode is richer than a single prose string. Joins **Claude Code migration parity** — Claude Code's error shape includes typed error kinds; claw-code's flat envelope loses information. Joins **`Run claw --help for usage` trailer-misuse** — the trailer is appended to errors that are not usage errors, which is both noise and misdirection. Natural bundle: **#90 + #91 + #92 + #130** — JSON envelope hygiene quartet. All four surface errors with insufficient structure for claws to dispatch on. Also **#121 + #130** — error-text-lies pair: hooks error names wrong thing (#121), export errno strips all context (#130). Also **Phase 2 §4 Canonical lane event schema exhibit A** — typed errors are the prerequisite for structured lane events. Session tally: ROADMAP #130.
**Re-verification (2026-04-22 cycle #39, main HEAD `186d42f`).** All 5 failure modes still reproduce identically to the original filing 2 days later. Concrete output:
```
$ claw export --output /tmp/nonexistent-dir-xyz/out.md --output-format json
{"error":"No such file or directory (os error 2)","hint":null,"kind":"unknown","type":"error"}
$ claw export --output /bin/cantwrite.md --output-format json
{"error":"Operation not permitted (os error 1)","hint":null,"kind":"unknown","type":"error"}
$ claw export --output "" --output-format json
{"error":"No such file or directory (os error 2)","hint":null,"kind":"unknown","type":"error"}
$ claw export --output / --output-format json
{"error":"File exists (os error 17)","hint":null,"kind":"unknown","type":"error"}
$ claw export --output /tmp/ --output-format json
{"error":"Is a directory (os error 21)","hint":null,"kind":"unknown","type":"error"}
```
**New evidence not in original filing.** The `kind` field is set to `"unknown"` — the classifier actively chose `unknown` rather than just omitting the field. This means `classify_error_kind()` (at main.rs:~251) has no substring match for "Is a directory", "No such file", "Operation not permitted", or "File exists". The typed-error contract is thus twice-broken on this path: (a) the io::ErrorKind information is discarded at the `?` in `run_export()`, AND (b) the flat `io::Error::Display` string is then fed to a classifier that has no patterns for filesystem errno strings.
**Natural pairing with #247/#248/#249 classifier sweep.** Same code path as #247's classifier fix (`classify_error_kind()`), same pattern (substring-matching classifier that lacks entries for specific error strings). #247 added patterns for prompt-related parse errors. #248 WIP adds patterns for verb-qualified unknown option errors. #130's classifier-level part (adding `NotFound`/`PermissionDenied`/`IsADirectory`/`AlreadyExists` substring branches) could land in the same sweep. The deeper fix (context preservation at `run_export()`'s `?`) is a separate, larger change — context-preservation requires `anyhow::Context` threading or typed error enum, not just classifier patterns.
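
One possible shape for the context-preservation part is a typed error that keeps the operation, the path, and the original `io::Error`. This is a sketch under stated assumptions: `ExportErrorSketch`, `write_export_sketch`, and the kind names are hypothetical, and the real `run_export()` may be structured quite differently.

```rust
use std::fmt;
use std::io;
use std::path::PathBuf;

/// Hypothetical typed error for export failures. It keeps the context that a
/// bare `?` on an `io::Error` would discard.
#[derive(Debug)]
struct ExportErrorSketch {
    operation: &'static str,
    path: PathBuf,
    source: io::Error,
}

impl ExportErrorSketch {
    /// Derive the kind from `io::ErrorKind` rather than from Display text.
    /// Kind names are assumptions; only long-stable variants are matched.
    fn kind(&self) -> &'static str {
        match self.source.kind() {
            io::ErrorKind::NotFound => "fs_not_found",
            io::ErrorKind::PermissionDenied => "fs_permission_denied",
            io::ErrorKind::AlreadyExists => "fs_already_exists",
            _ => "fs_other",
        }
    }
}

impl fmt::Display for ExportErrorSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "failed to {} {}: {}",
            self.operation,
            self.path.display(),
            self.source
        )
    }
}

impl std::error::Error for ExportErrorSketch {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        Some(&self.source)
    }
}

/// Write the export, attaching operation and path to any failure.
fn write_export_sketch(path: &str, contents: &str) -> Result<(), ExportErrorSketch> {
    std::fs::write(path, contents).map_err(|source| ExportErrorSketch {
        operation: "write",
        path: PathBuf::from(path),
        source,
    })
}
```

With an error type like this, the envelope could carry the failing path and operation directly, and the classifier would not need to pattern-match on errno prose for this path.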
**Repro (fresh box, no ANTHROPIC_* env vars).** `claw --model "bad model" version` → exit 0, emits version JSON (silent parse). `claw --model "" version` → exit 0, same. `claw --model "foo bar/baz" prompt "test"` → exit 1, `error: missing Anthropic credentials` (malformed model silently routes to Anthropic, then cred error masquerades as root cause instead of "invalid model syntax").
**The gap.** (1) No upfront model syntax validation in parse_args. `--model` accepts any string. (2) Silent fallback to Anthropic when provider detection fails on malformed syntax. (3) Downstream error misdirection — cred error doesn't say "your model string was invalid, I fell back to Anthropic." (4) Token burn on invalid model at API layer — with credentials set, malformed model reaches the API, billing tokens against a 400 response that should have been rejected client-side. (5) Joins #29 (provider routing silent fallback) — both involve Anthropic fallback masking the real intent. (6) Joins truth-audit — status/version JSON report malformed model without validation. (7) Joins cred-error misdirection family (#28, #99, #127).
@@ -6670,3 +6688,292 @@ Two atomic changes:
- #179 (parse-error real message quality) — claws consuming envelope expect truthful error
- #181 (envelope.exit_code matches process exit) — cross-channel truthfulness
- #30 (cycle #30: OPT_OUT rejection tests) — classification contracts deserve regression tests
---
## Pinpoint #249. Resumed-session slash command error envelopes omit `kind` field — typed-error contract violation at `main.rs:2747` and `main.rs:2783`
**Gap.** The typed-error envelope contract (§4.44) specifies every error envelope MUST include a `kind` field so claws can dispatch without regex-scraping prose. The `--output-format json` path for resumed-session slash commands has TWO branches that emit error envelopes WITHOUT `kind`:
1. **`main.rs:2747-2760`** (`SlashCommand::parse()` Err arm) — triggered when the raw command string is malformed or references an invalid slash structure. Fires for inputs like `claw --resume latest /session` (valid name, missing required subcommand arg).
2. **`main.rs:2783-2793`** (`run_resume_command()` Err arm) — triggered when the slash command dispatch returns an error (including `SlashCommand::Unknown`). Fires for inputs like `claw --resume latest /xyz-unknown`.
Both arms emit JSON envelopes of shape `{type, error, command}` but NOT `kind`, defeating typed-error dispatch for any claw routing on `error.kind`.
Also observed: the `/xyz-unknown` path embeds a multi-line error string (`Unknown slash command: /xyz-unknown\n Help ...`) directly into the `error` field without splitting the runbook hint into a separate `hint` field (per #77 `split_error_hint()` convention). JSON consumers get embedded newlines in the error string.
**Repro.** Dogfooded 2026-04-22 on main HEAD `84466bb` (cycle #37, post-#247 merge):
```bash
$ cd /Users/yeongyu/clawd/claw-code/rust
$ ./target/debug/claw --output-format json --resume latest /session
{"command":"/session","error":"unsupported resumed slash command","type":"error"}
# Observation: no `kind` field. Claws dispatching on error.kind get undefined.
$ ./target/debug/claw --output-format json --resume latest /xyz-unknown
{"command":"/xyz-unknown","error":"Unknown slash command: /xyz-unknown\n  Help             /help lists available slash commands","type":"error"}
# Observation: no `kind` field AND multi-line error without split hint.
$ ./target/debug/claw --output-format json --resume latest /session list
{"active":"session-...","kind":"session_list",...}
# Comparison: happy path DOES include kind field. Only the error path omits it.
```
Contrast with the `Ok(None)` arm at `main.rs:2735-2742` which DOES include `kind: "unsupported_resumed_command"` — proving the contract awareness exists, just not applied consistently across all Err arms.
**Impact.**
1. **Typed-error dispatch broken for slash-command errors.** A claw reading `{"type":"error", "error":"..."}` and switching on `error.kind` gets `undefined` for any resumed slash-command error. Must fall back to substring matching the `error` field, defeating the point of typed errors.
2. **Family-internal inconsistency.** The same error path (`eprintln!` → exit(2)) has three arms: `Ok(None)` sets kind, `Err(error)` (parse) doesn't, `Err(error)` (dispatch) doesn't. Random omission is worse than uniform absence because claws can't tell whether they're hitting a kind-less arm or an untyped category.
3. **Hint embedded in error field.** The `/xyz-unknown` path gets its runbook text inside the `error` string instead of a separate `hint` field, forcing consumers to post-process the message.
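
To make the first impact concrete, here is a minimal sketch of a consumer that routes on `kind` and has to fall back to prose matching when the field is absent. `ErrorEnvelopeSketch`, `Route`, and `route_error_sketch` are hypothetical; a real claw would deserialize the envelope from JSON and would likely have its own routing table.

```rust
/// Hypothetical, already-parsed view of an error envelope.
struct ErrorEnvelopeSketch<'a> {
    error: &'a str,
    kind: Option<&'a str>,
}

/// Hypothetical routing decisions a claw might make.
#[derive(Debug, PartialEq)]
enum Route {
    ReAuthenticate,
    FixCommand,
    Escalate,
}

fn route_error_sketch(envelope: &ErrorEnvelopeSketch<'_>) -> Route {
    match envelope.kind {
        // Typed dispatch: no prose inspection needed.
        Some("missing_credentials") => Route::ReAuthenticate,
        Some("unsupported_resumed_command") | Some("cli_parse") => Route::FixCommand,
        Some(_) => Route::Escalate,
        // Kind-less envelope: the consumer is forced back to substring
        // matching on the error text, which is what the contract should prevent.
        None if envelope.error.contains("slash command") => Route::FixCommand,
        None => Route::Escalate,
    }
}
```

The two `None` arms are the cost of the gap: they only exist because some arms omit `kind`.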
**Recommended fix shape.**
Two small, atomic edits in `main.rs`:
1. **Parse-error envelope** (line 2747): Add `"kind": "cli_parse"` to the JSON object. Optionally call `classify_error_kind(&error.to_string())` to get a more specific kind.
2. **Dispatch-error envelope** (line 2783): Same treatment. Classify using `classify_error_kind()`. Additionally, call `split_error_hint()` on `error.to_string()` to separate the short reason from any embedded hint (matches #77 convention used elsewhere).
```rust
// Before (line 2747):
serde_json::json!({
    "type": "error",
    "error": error.to_string(),
    "command": raw_command,
})

// After:
let message = error.to_string();
let kind = classify_error_kind(&message);
let (short_reason, hint) = split_error_hint(&message);
serde_json::json!({
    "type": "error",
    "error": short_reason,
    "hint": hint,
    "kind": kind,
    "command": raw_command,
})
```
**Regression coverage.** Add integration tests in `tests/output_format_contract.rs`:
- `resumed_session_bare_slash_name_emits_kind_field_249` — `/session` without subcommand
- `resumed_session_unknown_slash_emits_kind_field_249` — `/xyz-unknown`
- `resumed_session_unknown_slash_splits_hint_249` — multi-line error gets hint split
- Regression guard: `resumed_session_happy_path_session_list_unchanged_249` — confirm `/session list` JSON unchanged
**Blocker.** None. ~15 lines Rust, bounded.
**Priority.** Medium. Not red-state (errors ARE surfaced, exit code IS 2), but typed-error contract violation. Any claw doing `error.kind` dispatch on slash-command paths currently falls through to `undefined`.
**Source.** Jobdori cycle #37 proactive dogfood 2026-04-22 23:15 KST in response to Clawhip pinpoint nudge. Probed slash-command JSON error envelopes post-#247 merge; found two Err arms emitting envelopes without `kind`. Joins §4.44 typed-envelope family:
- #179 (parse-error real message quality) — closed
- #181 (envelope exit_code matches process exit) — closed
- #247 (classify_error_kind misses prompt-patterns + hint drop) — closed (cycle #34/#36)
- **#248 (verb-qualified unknown option errors misclassified) — in-flight (another agent)**
- **#249 (this: resumed-session slash command envelopes omit kind) — filed**
Natural bundle: **#247 + #248 + #249** — classifier/envelope completeness sweep. All three fix the same kind of drift: typed-error envelopes missing or mis-set `kind` field on specific CLI paths. When all three land, the typed-envelope contract is uniformly applied across:
- Top-level CLI argument parsing (#247)
- Subcommand option parsing (#248)
- Resumed-session slash command dispatch (#249)
**Related prior work.**
- §4.44 typed error envelope contract (2026-04-20)
- #77 split_error_hint() — should be applied to slash-command error path too
- #247 (model: add classifier branches + ensure envelope carries them)
---
## Pinpoint #250. CLI surface parity gap between Python audit harness and Rust binary — SCHEMAS.md documents `list-sessions`/`delete-session`/`load-session`/`flush-transcript` as CLAWABLE top-level subcommands, but the Rust `claw` binary routes these through the `_other => Prompt` fall-through arm, emitting `missing_credentials` instead of running the documented operation
**Gap.** SCHEMAS.md at the repo root defines a JSON envelope contract for 14 CLAWABLE top-level subcommands including `list-sessions`, `delete-session`, `load-session`, and `flush-transcript`. The Python audit harness at `src/main.py` implements all 14. The Rust `claw` binary at `rust/crates/rusty-claude-cli/` does NOT have these as top-level subcommands — session management lives behind `--resume <id> /session list` via the REPL slash command path.
A claw following SCHEMAS.md as the canonical contract runs `claw list-sessions --output-format json` and hits the Rust binary's `_other => Prompt` fall-through arm (same code path as the now-closed parser-level trust gap quintet #108/#117/#119/#122/#127). The literal token `"list-sessions"` is sent as a prompt to the LLM, which immediately fails with `missing Anthropic credentials` because the prompt path requires auth.
From the claw's perspective:
- **Expected** (per SCHEMAS.md): `{"command": "list-sessions", "exit_code": 0, "sessions": [...]}`
- **Actual** (Rust binary): `{"kind": "missing_credentials", "error": "missing Anthropic credentials; ..."}`
**Repro.** Dogfooded 2026-04-22 on main HEAD `5f8d1b9` (cycle #38):
```bash
$ cd /Users/yeongyu/clawd/claw-code/rust
$ env -i PATH=$PATH HOME=$HOME ./target/debug/claw list-sessions --output-format json
{"error":"missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY before calling the Anthropic API ...","hint":null,"kind":"missing_credentials","type":"error"}
# exit=1, NOT the documented SCHEMAS.md envelope
$ env -i PATH=$PATH HOME=$HOME ./target/debug/claw delete-session abc123 --output-format json
{"error":"missing Anthropic credentials; ...","hint":null,"kind":"missing_credentials","type":"error"}
# Same fall-through. `abc123` treated as prompt continuation.
$ env -i PATH=$PATH HOME=$HOME ./target/debug/claw --resume latest /session list --output-format json
{"active":"session-...","kind":"session_list","sessions":[...]}
# This is how the Rust binary actually exposes list-sessions — via REPL slash command.
$ python3 -m src.main list-sessions --output-format json
{"command": "list-sessions", "exit_code": 0, ..., "sessions": [...]}
# Python harness implements SCHEMAS.md directly.
```
**Impact.**
1. **Documentation-vs-implementation drift.** SCHEMAS.md is at the repo root (not under `src/` or `rust/`), implying it applies to the whole project. A claw reading SCHEMAS.md and assuming the contract applies to the canonical binary (`claw`) gets a credentials error, not the documented envelope.
2. **Cross-implementation parity gap.** The same logical operation ("list my sessions") has two different CLI shapes:
- Python harness: `python3 -m src.main list-sessions --output-format json`
- Rust binary: `claw --resume latest /session list --output-format json`
Claws that switch between implementations (e.g., for testing or migration) have to maintain two different dispatch tables.
3. **Joins the parser-level trust gap family.** This is the 6th entry in the `_other => Prompt` fall-through family but with a twist: unlike #108/#117/#119/#122/#127 (where the input was genuinely malformed), the input here IS a valid surface name that SCHEMAS.md documents. The fall-through is wrong for a different reason: the surface exists in the protocol but not in this implementation.
4. **Cred-error misdirection.** Same pattern as the pre-#127 `claw doctor --json` misdirection. A claw getting `missing_credentials` thinks it has an auth problem when really it has a surface-not-implemented problem.
**Fix options.**
**Option A: Implement the surfaces on the Rust binary.** Wire `list-sessions`, `delete-session`, `load-session`, `flush-transcript` as top-level subcommands in `rust/crates/rusty-claude-cli/src/main.rs`, each delegating to the existing session management code that currently lives behind `/session list`, `/session delete`, etc. Acceptance: all 4 subcommands emit the SCHEMAS.md envelope identically to the Python harness.
**Option B: Scope SCHEMAS.md explicitly to the Python audit harness.** Add a scope note at the top of SCHEMAS.md clarifying it documents the Python harness protocol, not the Rust binary surface. File a separate pinpoint for "canonical Rust binary JSON contract" if/when that's needed.
**Option C: Reject the surface mismatch at parse time.** Add explicit recognition in the Rust binary's top-level subcommand matcher that `list-sessions`/`delete-session`/etc. are Python-harness surfaces, and emit a structured error pointing to the Rust equivalent (`claw --resume latest /session list` etc.). Stop the fall-through into Prompt dispatch. Acceptance: running `claw list-sessions` in the Rust binary emits ``{"kind": "unsupported_surface", "error": "list-sessions is a Python audit harness surface; use `claw --resume latest /session list` for the Rust binary equivalent"}``.
**Recommended: Option C first (cheap, prevents cred misdirection), then Option B as documentation hygiene, then Option A if demand justifies the implementation cost.**
Option C is the same pattern as #127's fix: reject known-bad inputs at parse time with actionable hints instead of falling through to Prompt. This is a new case of the same fall-through category, with the twist that the "bad" input is actually documented as valid in a sibling context.
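
A minimal sketch of the Option C check, assuming a standalone helper rather than the real parser in `main.rs`. The table `HARNESS_SURFACES` and the function `unsupported_surface_envelope` are hypothetical names introduced for illustration. Only `/session list` and `/session delete` are named above as Rust equivalents, so the other two verbs carry no redirect here. The JSON is assembled by hand to keep the sketch dependency-free; the real binary would presumably go through `serde_json::json!` as its other envelopes do.

```rust
/// Python-harness verbs from SCHEMAS.md, paired with the Rust slash command
/// when the text above names one. `None` means no equivalent is named, so
/// none is guessed. (Hypothetical table, not taken from main.rs.)
const HARNESS_SURFACES: [(&str, Option<&str>); 4] = [
    ("list-sessions", Some("/session list")),
    ("delete-session", Some("/session delete")),
    ("load-session", None),
    ("flush-transcript", None),
];

/// Returns an `unsupported_surface` envelope as a JSON string, or `None`
/// when `verb` is not a known harness surface and normal parsing should
/// continue. The messages are fixed strings without quotes or backslashes,
/// so no JSON escaping is performed in this sketch.
fn unsupported_surface_envelope(verb: &str) -> Option<String> {
    let (_, redirect) = HARNESS_SURFACES.iter().find(|(name, _)| *name == verb)?;
    let error = match redirect {
        Some(slash) => format!(
            "{verb} is a Python audit harness surface; use `claw --resume latest {slash}` for the Rust binary equivalent"
        ),
        None => format!("{verb} is a Python audit harness surface"),
    };
    Some(format!(
        r#"{{"type":"error","kind":"unsupported_surface","error":"{error}"}}"#
    ))
}
```

Calling the helper before the `_other => Prompt` catchall would let the parser return the envelope for the four harness verbs and fall through only for genuine prompt text.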
**Regression.** If Option A: add end-to-end tests matching the Python harness's existing tests for each subcommand. If Option C: add integration tests for each of the 4 Python-harness surface names verifying they emit `unsupported_surface` with the correct redirect hint.
**Blocker.** None for Option C. Option A is larger (requires extending the Rust binary's top-level parser + wiring to session management). Option B is pure docs.
**Priority.** Medium-high. This is red-state in the sense that the binary silently misroutes a documented surface into a credentials error. It is not a bug in the sense that the Rust binary is missing functionality it promised; it is a bug in the sense that **protocol documentation promises a surface that doesn't exist at that address in the canonical implementation.** Either the docs are wrong or the implementation is incomplete, and nothing currently says which.
**Source.** Jobdori cycle #38 proactive dogfood 2026-04-22 23:35 KST in response to Clawhip pinpoint nudge. Probed session management CLI paths post-#247-merge; expected SCHEMAS.md envelope, got `missing_credentials` on all 4 surfaces. Joins:
- **Parser-level trust gap family** (#108, #117, #119, #122, #127) as 6th — same `_other => Prompt` fall-through, but the "bad" input is actually a documented surface in SCHEMAS.md (new case class).
- **Cred-error misdirection family** (#99, #127 pre-closure) — same pattern: local-ish operation silently needs creds because it fell into the wrong dispatch arm.
- **Documentation-vs-implementation drift family** — SCHEMAS.md documents 14 surfaces; Rust binary has ~8 top-level subcommands; mismatch is undocumented.
Natural bundle: **#127 + #250** — parser-level fall-through pair with a class distinction (#127 = suffix arg on valid verb; #250 = entire Python-harness verb treated as prompt).
**Related prior work.**
- SCHEMAS.md (the canonical envelope contract — drafted in Python-harness context)
- §4.44 typed-envelope contract
- #127 (closed: suffix arg rejection at parse time for diagnostic verbs)
- #108/#117/#119/#122/#127 (parser-level trust gap quintet)
- Python harness `src/main.py` (14 CLAWABLE surfaces)
- Rust binary `rust/crates/rusty-claude-cli/src/main.rs` (different top-level surface set)
---
## Pinpoint #251. Session-management verbs (`list-sessions`/`delete-session`/`load-session`/`flush-transcript`) fall through to Prompt dispatch at parse time before credential resolution — wrong error CLASS is emitted (auth) for what should be local session-store operations
**Gap.** This is the **dispatch-order framing** of the parity symptom filed at #250. Where #250 says "the surface is missing on the canonical binary and SCHEMAS.md promises it," #251 says "the underlying mechanism is a top-level parser fall-through that happens BEFORE the dispatcher can intercept the verb, so callers get `missing_credentials` instead of any session-layer response at all."
The two pinpoints describe the same observable failure from different layers:
- **#250 (surface layer):** SCHEMAS.md top-level verbs aren't implemented as top-level Rust subcommands.
- **#251 (dispatch layer):** The top-level parser has no match arm for these verbs, so they fall into the `_other => Prompt` catchall at `main.rs:1017`, which constructs `CliAction::Prompt { prompt: "list-sessions", ... }`. Downstream, the Prompt path requires credentials, and the CLI emits `missing_credentials` for a purely-local operation.
**The same pattern has been fixed before** for other purely-local verbs:
- **#145** — `plugins` was falling through to Prompt. Fix: explicit match arm at `main.rs:888-906` returning `CliAction::Plugins { ... }`.
- **#146** — `config` and `diff` were falling through. Fix: explicit match arms at `main.rs:911-935` returning `CliAction::Config { ... }` and `CliAction::Diff { ... }`.
Both fixes followed identical shape: intercept the verb at top-level parse, construct the corresponding `CliAction` variant, bypass the Prompt/credential path entirely. #251 extends this to the 4 session-management verbs.
**Repro.** Dogfooded 2026-04-23 cycle #40 on main HEAD `f110333`:
```bash
$ env -i PATH=$PATH HOME=$HOME /path/to/claw list-sessions --output-format json
{"error":"missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY ...","kind":"missing_credentials","type":"error"}
# Expected: session-layer envelope like {"command":"list-sessions","sessions":[...]}
# Actual: auth-layer error because the verb was treated as a prompt.
```
**Code trace (verified cycle #40).**
- `main.rs:1017-1027` — the final `_other` arm of the top-level parser. Joins all unrecognized tokens with spaces and constructs `CliAction::Prompt { prompt: joined, ... }`.
- Downstream, the Prompt dispatcher calls `resolve_credentials()` which emits `missing Anthropic credentials` when neither `ANTHROPIC_API_KEY` nor `ANTHROPIC_AUTH_TOKEN` is set.
- No credential resolution would have been needed had the verb been intercepted earlier.
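
The trace above can be reproduced in a toy model. This is a sketch under stated assumptions, not the code in `main.rs`: the two-variant `CliAction`, the `parse` and `dispatch` functions, and the `api_key` parameter are simplified stand-ins, and `plugins` represents any verb that already has its own match arm.

```rust
#[derive(Debug, PartialEq)]
enum CliAction {
    Plugins,
    Prompt { prompt: String },
}

/// Toy top-level parser: one explicit arm, then the catchall that joins
/// every token into a prompt string.
fn parse(args: &[&str]) -> CliAction {
    match args.first().copied() {
        Some("plugins") => CliAction::Plugins,
        _other => CliAction::Prompt { prompt: args.join(" ") },
    }
}

/// Toy dispatcher: only the Prompt path needs credentials.
fn dispatch(action: &CliAction, api_key: Option<&str>) -> Result<String, String> {
    match action {
        CliAction::Plugins => Ok("plugins: local, no credentials needed".to_string()),
        CliAction::Prompt { prompt } => {
            // Credential resolution happens here, after the verb has already
            // been misread as prompt text.
            let _key = api_key.ok_or_else(|| "missing Anthropic credentials".to_string())?;
            Ok(format!("would send prompt: {prompt}"))
        }
    }
}
```

With no key set, `dispatch(&parse(&["list-sessions"]), None)` yields the credentials error, while `plugins` succeeds: the error class is decided by which arm the parser picked, not by what the operation needs.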
**Relationship to #250.**
| Aspect | #250 | #251 |
|---|---|---|
| **Layer** | Surface / documentation | Dispatch / parser internals |
| **Framing** | Protocol vs implementation drift | Wrong dispatch order |
| **Fix scope** | 3 options (docs scope, Rust impl, reject-with-redirect) | Narrow: add match arms mirroring #145/#146 |
| **Evidence** | SCHEMAS.md promises ≠ binary delivers | Parser fall-through happens before the dispatcher can classify the verb |
They share the observable (`missing_credentials` on a documented surface) but prescribe different scopes of fix:
- **#250's Option A** (implement the surfaces) = **#251's proper fix** — actually wire the session-management paths.
- **#250's Option C** (reject with redirect) = a different fix that doesn't implement the verbs but at least stops the auth-error misdirection.
**Recommended sequence:**
1. **#251 fix** (implement the 4 match arms following the #145/#146 pattern) is the principled solution — it makes the canonical binary honor SCHEMAS.md.
2. **#250's documentation scope note** (Option B) remains valuable regardless, as a guard against future drift between the two implementations.
3. **#250's Option C** (reject with redirect) becomes unnecessary if #251 lands — no verbs to redirect away from.
**Fix shape (~40 lines).**
Add 4 match arms to the top-level parser (file: `rust/crates/rusty-claude-cli/src/main.rs:~840-1015`), each mirroring the pattern from `plugins`/`config`/`diff`:
```rust
"list-sessions" => {
    let tail = &rest[1..];
    // list-sessions: optional --directory flag already parsed; no positional args
    if !tail.is_empty() {
        return Err(format!(
            "unexpected extra arguments after `claw list-sessions`: {}",
            tail.join(" ")
        ));
    }
    Ok(CliAction::ListSessions { output_format, directory: /* already parsed */ })
}
"delete-session" => {
    let tail = &rest[1..];
    // delete-session: requires session-id positional
    let session_id = tail
        .first()
        .ok_or_else(|| "delete-session requires a session-id argument".to_string())?
        .clone();
    if tail.len() > 1 {
        return Err(format!(
            "unexpected extra arguments after `claw delete-session {session_id}`: {}",
            tail[1..].join(" ")
        ));
    }
    Ok(CliAction::DeleteSession { session_id, output_format, directory: /* already parsed */ })
}
"load-session" => { /* same pattern */ }
"flush-transcript" => { /* same pattern, with --session-id flag handling */ }
```
Plus `CliAction` variants, dispatcher wiring, and regression tests. Likely ~40 lines of Rust + tests if session-store operations already exist in `runtime/`.
**Acceptance.** All 4 verbs emit session-layer envelopes matching the SCHEMAS.md contract:
- `claw list-sessions --output-format json` → `{"command":"list-sessions","sessions":[...],"exit_code":0}`
- `claw delete-session <id> --output-format json` → `{"command":"delete-session","deleted":true,"exit_code":0}`
- `claw load-session <id> --output-format json` → `{"command":"load-session","session":{...},"exit_code":0}`
- `claw flush-transcript --session-id <id> --output-format json` → `{"command":"flush-transcript","flushed":N,"exit_code":0}`
No credential resolution is triggered for any of these paths.
**Regression tests.**
- Each verb with valid arguments: emits correct envelope, exit 0.
- Each verb with missing required argument: emits `cli_parse` error envelope (with kind), exit 1.
- Each verb with extra arguments: emits `cli_parse` error envelope rejecting them.
- Regression guard: `claw list-sessions` in env-cleaned environment does NOT emit `missing_credentials`.
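
The parse-level cases in this list can be exercised without the full binary. The sketch below assumes a hypothetical helper `parse_session_verb` and a hypothetical `SessionAction` enum covering only two of the four verbs; it mirrors the fix shape above but is not the real `CliAction` wiring, and it says nothing about envelope emission or exit codes.

```rust
#[derive(Debug, PartialEq)]
enum SessionAction {
    List,
    Delete { session_id: String },
}

/// Returns `Ok(Some(..))` for a recognized session verb, `Ok(None)` when the
/// first token is not a session verb (the caller keeps parsing), and `Err`
/// for a recognized verb with bad arguments.
fn parse_session_verb(args: &[&str]) -> Result<Option<SessionAction>, String> {
    let Some((verb, tail)) = args.split_first() else {
        return Ok(None);
    };
    match *verb {
        "list-sessions" => {
            if !tail.is_empty() {
                return Err(format!(
                    "unexpected extra arguments after `claw list-sessions`: {}",
                    tail.join(" ")
                ));
            }
            Ok(Some(SessionAction::List))
        }
        "delete-session" => {
            let session_id = tail
                .first()
                .ok_or_else(|| "delete-session requires a session-id argument".to_string())?
                .to_string();
            if tail.len() > 1 {
                return Err(format!(
                    "unexpected extra arguments after `claw delete-session {session_id}`: {}",
                    tail[1..].join(" ")
                ));
            }
            Ok(Some(SessionAction::Delete { session_id }))
        }
        _ => Ok(None),
    }
}
```

Because the helper never touches credentials, a test that asserts `parse_session_verb(&["list-sessions"])` returns `Ok(Some(SessionAction::List))` doubles as the guard against the `missing_credentials` misroute at the parser level.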
**Blocker.** None. Bounded to 4 additional top-level match arms + corresponding `CliAction` variants + dispatcher wiring. Session-store operations may need minor extraction from `/session list` implementation.
**Priority.** Medium-high. Same severity as #250 (silent misdirection on a documented surface), with sharper framing. Closing #251 automatically resolves #250's Option A and makes Option C unnecessary.
**Source.** Filed 2026-04-23 00:00 KST by gaebal-gajae (conceptual filing in Discord cycle status at msg 1496526112254328902); verified and formalized into ROADMAP by Jobdori cycle #40. Natural bundle:
- **#145 + #146 + #251** — parser fall-through fix pattern (plugins, config/diff, session-management verbs). All 3 follow identical fix shape: intercept at top-level parse, bypass Prompt/credential path.
- **#250 + #251** — symptom/mechanism pair on the same observable failure. #250 frames it as protocol-vs-implementation drift; #251 frames it as dispatch-order bug.
- **#99 + #127 + #250 + #251** — cred-error misdirection family. Each case: a purely-local operation silently routes through the auth-required Prompt path and emits the wrong error class.
**Related prior work.**
- #145 (plugins fall-through fix) — direct template for #251 fix shape
- #146 (config/diff fall-through fix) — same pattern
- #250 (surface parity framing of same failure)
- §4.44 typed-envelope contract
- SCHEMAS.md (specifies the 4 session-management verbs as top-level CLAWABLE surfaces)

@@ -2745,11 +2745,20 @@ fn resume_session(session_path: &Path, commands: &[String], output_format: CliOu
 }
 Err(error) => {
     if output_format == CliOutputFormat::Json {
+        // #249: thread classify_error_kind + split_error_hint through this arm
+        // so the JSON envelope carries the same `kind` and `hint` fields as
+        // the Ok(None) path's error branch at main.rs:2658. Without these, claws
+        // routing on `kind` couldn't distinguish parse errors from other classes.
+        let full_message = error.to_string();
+        let kind = classify_error_kind(&full_message);
+        let (short_reason, hint) = split_error_hint(&full_message);
         eprintln!(
             "{}",
             serde_json::json!({
                 "type": "error",
-                "error": error.to_string(),
+                "error": short_reason,
+                "kind": kind,
+                "hint": hint,
                 "command": raw_command,
             })
         );
@@ -2782,11 +2791,19 @@ fn resume_session(session_path: &Path, commands: &[String], output_format: CliOu
 }
 Err(error) => {
     if output_format == CliOutputFormat::Json {
+        // #249: mirror the Err arm above — emit the typed-error contract
+        // (kind + hint) for the run_resume_command failure path so claws
+        // routing on `kind` can distinguish parse/classification errors.
+        let full_message = error.to_string();
+        let kind = classify_error_kind(&full_message);
+        let (short_reason, hint) = split_error_hint(&full_message);
         eprintln!(
             "{}",
             serde_json::json!({
                 "type": "error",
-                "error": error.to_string(),
+                "error": short_reason,
+                "kind": kind,
+                "hint": hint,
                 "command": raw_command,
             })
         );
@@ -10475,6 +10492,46 @@ mod tests {
);
}
#[test]
fn resumed_slash_error_envelope_includes_kind_and_hint_249() {
// #249: the resumed-session slash command Err arms at main.rs:~2747
// and main.rs:~2783 previously emitted JSON envelopes without `kind`
// or `hint` fields, breaking the typed-error contract for claws
// routing on error class. This test documents the contract: typical
// error messages produced through these paths (parse_command_token
// failures, run_resume_command failures) must classify correctly
// via the existing classify_error_kind() helper.
// parse_command_token path (main.rs:~2747): unknown slash commands
// currently classify as "unknown" rather than cli_parse.
assert_eq!(
classify_error_kind("unknown slash command outside the REPL: /blargh"),
"unknown",
"unknown slash command without verb+option shape is currently unknown (may be tightened later via #248-family classifier work)"
);
// run_resume_command path (main.rs:~2783): common filesystem errors
// propagated from save_to_path(), write_session_clear_backup(), etc.
// These will classify via classify_error_kind; what matters is the
// envelope now carries the kind and hint fields instead of omitting them.
// Test the contract: whatever string is passed, it gets a kind and hint.
let full_message = "compact failed: I/O error during save";
let kind = classify_error_kind(full_message);
let (short_reason, hint) = split_error_hint(full_message);
// Envelope building block must not panic and must produce usable values.
assert!(!kind.is_empty(), "classify_error_kind must return a non-empty class name");
assert!(!short_reason.is_empty(), "split_error_hint must return a non-empty short reason");
// hint may be None (single-line message has no hint), that's allowed.
let _ = hint;
// Regression guard for the envelope SHAPE: the two arms must now include
// `kind` and `hint` fields (along with the pre-existing `type`, `error`,
// `command` fields). This is a structural contract — if anyone reverts
// the envelope to drop these fields, the code review must reject it.
// (Direct JSON inspection requires integration test infrastructure; this
// unit test verifies the building blocks work correctly.)
}
#[test]
fn split_error_hint_separates_reason_from_runbook() {
// #77: short reason / hint separation for JSON error payloads