The runtime already tracked rough token estimates for compaction, but provider-bound
requests still relied on naive model output limits and could be sent upstream even
when the selected model could not fit the estimated prompt plus requested output.
This adds a small model token/context registry in the API layer, estimates request
size from the serialized prompt payload, and fails locally with a dedicated
context-window error before Anthropic or xAI calls are made. Focused integration
coverage asserts the preflight fires before any HTTP request leaves the process.
Constraint: Keep the first pass minimal and reusable across both Anthropic and OpenAI-compatible providers
Rejected: Auto-compact-and-retry in the same patch | broader control-flow change than the requested minimal preflight
Confidence: medium
Scope-risk: narrow
Reversibility: clean
Directive: Expand the model registry before enabling preflight for additional providers or aliases
Tested: cargo build -p api -p tools -p rusty-claude-cli; cargo test -p api
Not-tested: End-to-end CLI auto-compaction or retry behavior after a local context_window_blocked failure
Extended thinking needed to travel end-to-end through the API,
runtime, and CLI so the client can request a thinking budget,
preserve streamed reasoning blocks, and present them in a
collapsed text-first form. The implementation keeps thinking
strictly opt-in, adds a session-local toggle, and reuses the
existing flag/slash-command/reporting surfaces instead of
introducing a new UI layer.
Constraint: Existing non-thinking text/tool flows had to remain backward compatible by default
Constraint: Terminal UX needed a lightweight collapsed representation rather than an interactive TUI widget
Rejected: Heuristic CLI-only parsing of reasoning text | brittle against structured stream payloads
Rejected: Expanded raw thinking output by default | too noisy for normal assistant responses
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep thinking blocks structurally separate from answer text unless the upstream API contract changes
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Live upstream thinking payloads against the production API contract
Extended thinking needed to travel end-to-end through the API,
runtime, and CLI so the client can request a thinking budget,
preserve streamed reasoning blocks, and present them in a
collapsed text-first form. The implementation keeps thinking
strictly opt-in, adds a session-local toggle, and reuses the
existing flag/slash-command/reporting surfaces instead of
introducing a new UI layer.
Constraint: Existing non-thinking text/tool flows had to remain backward compatible by default
Constraint: Terminal UX needed a lightweight collapsed representation rather than an interactive TUI widget
Rejected: Heuristic CLI-only parsing of reasoning text | brittle against structured stream payloads
Rejected: Expanded raw thinking output by default | too noisy for normal assistant responses
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep thinking blocks structurally separate from answer text unless the upstream API contract changes
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Live upstream thinking payloads against the production API contract
The Rust CLI now recognizes explicit local image references in prompt text,
encodes supported image files as base64, and serializes mixed text/image
content blocks for the API. The request conversion path was kept narrow so
existing runtime/session structures remain stable while prompt mode and user
text conversion gain multimodal support.
Constraint: Must support PNG, JPG/JPEG, GIF, and WebP without adding broad runtime abstractions
Constraint: Existing text-only prompt behavior and API tool flows must keep working unchanged
Rejected: Add only explicit --image CLI flags | does not satisfy auto-detect image refs in prompt text
Rejected: Persist native image blocks in runtime session model | broader refactor than needed for prompt support
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep image parsing scoped to outbound user prompt adaptation unless session persistence truly needs multimodal history
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Live remote multimodal request against Anthropic API
The Rust CLI now recognizes explicit local image references in prompt text,
encodes supported image files as base64, and serializes mixed text/image
content blocks for the API. The request conversion path was kept narrow so
existing runtime/session structures remain stable while prompt mode and user
text conversion gain multimodal support.
Constraint: Must support PNG, JPG/JPEG, GIF, and WebP without adding broad runtime abstractions
Constraint: Existing text-only prompt behavior and API tool flows must keep working unchanged
Rejected: Add only explicit --image CLI flags | does not satisfy auto-detect image refs in prompt text
Rejected: Persist native image blocks in runtime session model | broader refactor than needed for prompt support
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep image parsing scoped to outbound user prompt adaptation unless session persistence truly needs multimodal history
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Live remote multimodal request against Anthropic API