From 93e979261efec8627c16dc5040ddb2526b86064e Mon Sep 17 00:00:00 2001 From: Yeachan-Heo Date: Sun, 5 Apr 2026 18:12:13 +0000 Subject: [PATCH] Record session state classification gap from dogfood --- ROADMAP.md | 1 + 1 file changed, 1 insertion(+) diff --git a/ROADMAP.md b/ROADMAP.md index e48676d..839ccfd 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -306,6 +306,7 @@ Priority order: P0 = blocks CI/green state, P1 = blocks integration wiring, P2 = 17. **Orphaned module integration audit** — `session_control` is `pub mod` exported from `runtime` but has zero consumers across the entire workspace (no import, no call site outside its own file). `trust_resolver` types are re-exported from `lib.rs` but never instantiated outside unit tests. These modules implement core clawability contracts (session management, trust resolution) that are structurally dead — built but not wired into the CLI or tools crate. **Action:** audit all `pub mod` / `pub use` exports from `runtime` for actual call sites; either wire orphaned modules into the real execution path or demote to `pub(crate)` / `cfg(test)` to prevent false clawability surface. 18. **Context-window preflight gap** — claw-code auto-compacts only after cumulative input crosses a static `100_000`-token threshold, while provider requests derive `max_tokens` from a naive model-name heuristic (`opus` => 32k, else 64k) and do not appear to preflight `estimated_prompt_tokens + requested_output_tokens` against the selected model’s actual context window. Result: giant sessions can be sent upstream and fail hard with provider-side `input_exceeds_context_by_*` errors instead of local preflight compaction/rejection. **Action:** add a model-context registry + request-size preflight before provider call; if projected request exceeds context, emit a structured `context_window_blocked` event and auto-compact or force `/compact` before retry. 19. **Subcommand help falls through into runtime/API path** — direct dogfood shows `./target/debug/claw doctor --help` and `./target/debug/claw status --help` do not render local subcommand help. Instead they enter the request path, show `🦀 Thinking...`, then fail with `api returned 500 ... auth_unavailable: no auth available`. Help/usage surfaces must be pure local parsing and never require auth or provider reachability. **Action:** fix argv dispatch so ` --help` is intercepted before runtime startup/API client initialization; add regression tests for `doctor --help`, `status --help`, and similar local-info commands. +20. **Session state classification gap (working vs blocked vs finished vs truly stale)** — dogfooding with 14 parallel tmux/OMX lanes exposed that text-idle stale detection is far too coarse. Sessions were repeatedly flagged as stale even when they were already **finished/reportable** (P0.9, P0.10, P2.18, P2.19), **working but quiet** (doc/branding/audit passes), or **blocked on a specific recoverable state** (background terminal still running, cherry-pick conflict, MCP startup noise, transport interruption after partial progress). **Action:** add explicit machine states above prose scraping such as `working`, `blocked_background_job`, `blocked_merge_conflict`, `degraded_mcp`, `interrupted_transport`, `finished_pending_report`, `finished_cleanable`, and `truly_idle`; update clawhip/session monitoring so quiet work is not paged as stale and completed sessions can auto-report + auto-clean. **P3 — Swarm efficiency** 13. Swarm branch-lock protocol — detect same-module/same-branch collision before parallel workers drift into duplicate implementation