roadmap: #260 filed

Jobdori
2026-04-26 12:06:57 +09:00
committed by YeonGyu-Kim
parent fe10cb39c1
commit 971c1a808e

@@ -16851,3 +16851,15 @@ Dogfooded 2026-04-26 12:00 KST after cycle #396: a dogfood status report posted
Concrete failure mode: multi-agent dogfood coordination can regress to outdated phase summaries even while the branch is actively moving. Operators then have to manually cross-check `git log`, ROADMAP markers, and chat history to decide whether the report is actionable. This is distinct from #253 compact state-vector budgeting: #253 bounds context size; #259 requires freshness/provenance assertions before publishing a compact status.
Required fix shape: every dogfood status report should include machine-checked provenance fields (`generated_at`, `repo`, `branch`, `head`, `head_timestamp`, `roadmap_last_pinpoint`, `git_fetch_time`, `source=git+ROADMAP`, `staleness_seconds`) and refuse/label reports when the source snapshot is older than a small threshold. `claw dogfood status --compact` should fetch, parse latest ROADMAP pinpoint id, compare against local chat-memory claims, and emit `STALE_STATUS_SOURCE` if they disagree. Acceptance: a report cannot claim “no new commits/new pinpoints” while `origin/feat/jobdori-168c-emission-routing` contains newer commits/pinpoints than its own provenance head. **Status:** Open. Filed as ROADMAP-only dogfood pinpoint from the 2026-04-26 03:00 UTC nudge; live branch was verified before filing and pushed on top of #258.
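
The freshness gate described above can be sketched as follows. This is a minimal illustration, not claw's implementation: `StatusProvenance`, `RemoteSnapshot`, `StatusVerdict`, and `check_status` are hypothetical names, only a subset of the listed provenance fields is modelled, and the staleness threshold is a caller-supplied assumption.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical provenance record; field names follow the list above,
/// but the struct itself is illustrative, not a type from the codebase.
#[derive(Debug, Clone)]
pub struct StatusProvenance {
    pub generated_at: SystemTime,
    pub git_fetch_time: SystemTime,
    pub head: String,               // commit the report was built from
    pub roadmap_last_pinpoint: u32, // latest pinpoint id parsed from ROADMAP
}

/// Hypothetical view of the remote branch right after a fetch.
#[derive(Debug, Clone)]
pub struct RemoteSnapshot {
    pub head: String,
    pub roadmap_last_pinpoint: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum StatusVerdict {
    Fresh,
    /// Corresponds to the `STALE_STATUS_SOURCE` label in the text.
    StaleStatusSource { reason: String },
}

/// Seconds between the last fetch and report generation (0 if clocks disagree).
pub fn staleness_seconds(p: &StatusProvenance) -> u64 {
    p.generated_at
        .duration_since(p.git_fetch_time)
        .unwrap_or(Duration::ZERO)
        .as_secs()
}

/// Refuse to call a report fresh when the snapshot is too old or when the
/// remote has a different head or a newer pinpoint than the report claims.
pub fn check_status(
    p: &StatusProvenance,
    remote: &RemoteSnapshot,
    max_age: Duration,
) -> StatusVerdict {
    let age = staleness_seconds(p);
    if age > max_age.as_secs() {
        return StatusVerdict::StaleStatusSource {
            reason: format!("source snapshot is {age}s old"),
        };
    }
    if p.head != remote.head {
        return StatusVerdict::StaleStatusSource {
            reason: format!("provenance head {} != remote head {}", p.head, remote.head),
        };
    }
    if remote.roadmap_last_pinpoint > p.roadmap_last_pinpoint {
        return StatusVerdict::StaleStatusSource {
            reason: format!(
                "remote pinpoint #{} is newer than reported #{}",
                remote.roadmap_last_pinpoint, p.roadmap_last_pinpoint
            ),
        };
    }
    StatusVerdict::Fresh
}
```

Under this sketch, a report generated at head `1daf636` would be labelled stale as soon as the fetched remote shows a different head or a pinpoint id above the one the report recorded, which is the acceptance condition stated above.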
## Pinpoint #260: `--compact --output-format json` envelope silently strips six observability fields (auto_compaction, iterations, tool_uses, tool_results, prompt_cache_events, estimated_cost) that the non-compact JSON envelope emits, with no diagnostic, no marker delta beyond `compact: true`, and no documentation that the strip occurs
Dogfooded 2026-04-26 12:05 KST on `feat/jobdori-168c-emission-routing` at HEAD `1daf636` (post-#259 fast-forward verification). The dispatch in `LiveCli::run_turn_with_output` (`rust/crates/rusty-claude-cli/src/main.rs:4637-4650`) routes `CliOutputFormat::Json if compact` to `run_prompt_compact_json` (`main.rs:4665-4688`) and `CliOutputFormat::Json` (no compact) to `run_prompt_json` (`main.rs:4690-4729`). Both paths receive the SAME `runtime::TurnSummary` from `runtime.run_turn(...)`, but the two envelopes serialize very different field sets. `run_prompt_json` emits nine top-level keys: `message`, `model`, `iterations`, `auto_compaction`, `tool_uses`, `tool_results`, `prompt_cache_events`, `usage`, `estimated_cost`. `run_prompt_compact_json` emits four: `message`, `compact: true`, `model`, `usage`. **Six observability-critical fields are dropped silently**, most notably `auto_compaction` (the structured signal that the runtime auto-compacted the session mid-turn, removing N messages from history) and `iterations` (turn-loop iteration count, the only non-summary signal of how the agent reached the final assistant text). The `compact: true` marker is the ONLY diff a downstream JSON consumer can observe; nothing in the envelope, the help text, or any structured-error stream tells the operator that adding `--compact` discarded the auto-compaction event, the iteration count, the tool-use trace, the tool-result trace, the prompt-cache events, and the cost estimate. Operators who script `claw -p "x" --compact --output-format json | jq` to keep wire size small unknowingly lose the only signal that auto-compaction fired, and the only way to recover it is to remove `--compact` and re-run the prompt.
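
The key-set delta between the two envelopes can be computed mechanically. The sketch below uses only the key names quoted above; the constants `FULL_KEYS` and `COMPACT_KEYS` and the helper `dropped_keys` are hypothetical names for illustration and do not come from `main.rs`.

```rust
use std::collections::BTreeSet;

/// Top-level keys of the non-compact envelope, as listed in the text.
pub const FULL_KEYS: [&str; 9] = [
    "message",
    "model",
    "iterations",
    "auto_compaction",
    "tool_uses",
    "tool_results",
    "prompt_cache_events",
    "usage",
    "estimated_cost",
];

/// Top-level keys of the compact envelope, as listed in the text.
pub const COMPACT_KEYS: [&str; 4] = ["message", "compact", "model", "usage"];

/// Keys present in `full` but absent from `compact`, in sorted order.
pub fn dropped_keys<'a>(full: &[&'a str], compact: &[&str]) -> Vec<&'a str> {
    let kept: BTreeSet<&str> = compact.iter().copied().collect();
    let dropped: BTreeSet<&'a str> = full
        .iter()
        .copied()
        .filter(|k| !kept.contains(k))
        .collect();
    dropped.into_iter().collect()
}
```

Calling `dropped_keys(&FULL_KEYS, &COMPACT_KEYS)` yields the six fields named in the pinpoint title: `auto_compaction`, `estimated_cost`, `iterations`, `prompt_cache_events`, `tool_results`, `tool_uses`.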
Gap. This is **silent-strip-on-response-envelope at the CLI output layer**, distinct from #136 (which only verified that `run_prompt_compact_json` exists and emits valid JSON with `compact: true`, never auditing what the envelope drops vs. its non-compact sibling) and distinct from #98 (which audited `--compact` being silently *ignored* outside the prompt-text path; #136 closed that by adding the dispatch arm — but the new envelope itself is the gap). The compact-JSON path was added to honor the flag, but the envelope was hand-coded with a minimal field set that omits exactly the fields a JSON-mode operator most needs (auto_compaction event, iteration count, cost). Worse, `auto_compaction` is the documented mechanism by which #134/#135's session-identity signals propagate — stripping it silently disables that downstream observability for any consumer that scripted around `--compact --output-format json`.
Cluster delta: joins the silent-fallback / silent-drop / silent-strip / silent-coercion sibling-shape cluster, extending it from 8 to 9 members. Distinct from #258 (CLI parse boundary, request-side), distinct from #213/#207/#208 (provider boundary, response-side wire deserialization), distinct from #203 (no streaming auto_compaction event at all). #260 is the FIRST member where the silent-strip happens at the **CLI response-envelope serialization layer** — after the runtime has fully populated the summary, the CLI itself drops the fields when assembling the JSON. Founds the **CLI-response-envelope-silent-strip sub-shape** within the silent-fallback family: the runtime computes the signal correctly; the CLI envelope serializer chooses not to surface it; no diagnostic surfaces the choice. Sibling-shape with #258 in that both extend the silent-fallback cluster at the CLI boundary, but #258 is request-side parse and #260 is response-side serialize — together they bracket the full CLI I/O perimeter for the silent-fallback family. Does NOT found a new top-level cluster (per #253 context-budget discipline preferring extension over founding).
Required fix shape: (a) align `run_prompt_compact_json` envelope so it emits the SAME field set as `run_prompt_json` minus only the fields whose value is genuinely stripped by the compact intent (the documented intent is "strip tool call details; print only the final assistant text" — so dropping `tool_uses`/`tool_results` is intentional, but dropping `auto_compaction`/`iterations`/`prompt_cache_events`/`estimated_cost` is not); concretely, add `iterations`, `auto_compaction`, `prompt_cache_events`, and `estimated_cost` to the compact-JSON envelope; (b) document the field-set delta in `--help` for `--compact` ("in JSON mode, strips `tool_uses` and `tool_results`; preserves `auto_compaction`, `iterations`, `prompt_cache_events`, `usage`, `estimated_cost`"); (c) add a regression test `run_prompt_compact_json_preserves_auto_compaction_signal` that asserts the compact-JSON envelope contains the `auto_compaction` key (null or populated) so future envelope edits cannot silently regress; (d) optionally emit a structured `EnvelopeFieldStrip` event listing dropped fields when `--output-format json` is active so downstream consumers can self-discover what the compact lane drops. Acceptance: `claw -p "x" --compact --output-format json | jq 'keys'` returns at least `["auto_compaction", "compact", "estimated_cost", "iterations", "message", "model", "prompt_cache_events", "usage"]`; the only fields stripped relative to non-compact are the documented `tool_uses`/`tool_results`; a synthetic auto-compaction event surfaces under `--compact` identically to non-compact.
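
The acceptance condition above can be expressed as a key-set check. This is a sketch under stated assumptions: `FULL_KEYS`, `INTENTIONALLY_STRIPPED`, `expected_compact_keys`, and `missing_from_compact` are hypothetical names, and the check works on plain key lists rather than on the real envelope type.

```rust
/// Top-level keys of the non-compact envelope, as listed in this pinpoint.
pub const FULL_KEYS: [&str; 9] = [
    "message",
    "model",
    "iterations",
    "auto_compaction",
    "tool_uses",
    "tool_results",
    "prompt_cache_events",
    "usage",
    "estimated_cost",
];

/// Fields the compact intent is documented to strip.
pub const INTENTIONALLY_STRIPPED: [&str; 2] = ["tool_uses", "tool_results"];

/// Key set the compact envelope should expose after the fix:
/// every non-compact key except the intentional strips, plus `compact`.
pub fn expected_compact_keys(full: &[&str]) -> Vec<String> {
    let mut keys: Vec<String> = full
        .iter()
        .copied()
        .filter(|k| !INTENTIONALLY_STRIPPED.iter().any(|s| s == k))
        .map(|k| k.to_string())
        .collect();
    keys.push("compact".to_string());
    keys.sort(); // same ordering `jq 'keys'` would print
    keys
}

/// Required keys that an observed compact envelope fails to emit.
pub fn missing_from_compact(actual: &[&str]) -> Vec<String> {
    expected_compact_keys(&FULL_KEYS)
        .into_iter()
        .filter(|want| !actual.iter().any(|have| *have == want.as_str()))
        .collect()
}
```

Fed the four keys the compact envelope emits today, `missing_from_compact` reports `auto_compaction`, `estimated_cost`, `iterations`, and `prompt_cache_events`; a regression test in the spirit of (c) would assert that this list is empty.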
**Status:** Open. No source code changed. Filed 2026-04-26 12:05 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `1daf636` (post-#259 fast-forward verification onto gaebal-gajae's stale-status-report-provenance pinpoint). Cluster delta: silent-fallback-family extension 8→9 (no new top-level cluster founded, per #253 context-budget discipline). CLI-response-envelope-silent-strip sub-shape introduced (response-envelope strip layer, sibling to #258's request-parse layer). Siblings: #98 (silent-flag-no-op class, predecessor at the dispatch layer, closed by #136), #136 (compact-JSON envelope existence, closed without auditing field-set parity), #203 (auto_compaction summary-only, no streaming event; #260 escalates: even the summary signal is dropped under `--compact`), #258 (CLI parse boundary silent-coercion, request-side complement to #260's response-side strip). Does not duplicate #98/#136: those audited dispatch and envelope-existence; #260 audits envelope-content-parity vs. its non-compact sibling, a structurally distinct surface.