From 959bdf849151d8fd4dca7d6d798d23d7c0f05a04 Mon Sep 17 00:00:00 2001 From: YeonGyu-Kim Date: Sat, 25 Apr 2026 22:16:02 +0900 Subject: [PATCH] =?UTF-8?q?roadmap:=20#214=20filed=20=E2=80=94=20ChunkDelt?= =?UTF-8?q?a=20and=20ChatMessage=20in=20openai=5Fcompat.rs=20deserialize?= =?UTF-8?q?=20only=20content/tool=5Fcalls;=20delta.reasoning=5Fcontent=20(?= =?UTF-8?q?sibling=20to=20delta.content,=20the=20canonical=20wire=20field?= =?UTF-8?q?=20for=20DeepSeek=20deepseek-reasoner=20/=20Alibaba=20Qwen3-Thi?= =?UTF-8?q?nking=20/=20QwQ=20/=20vLLM=20reasoning-parser=20backends)=20is?= =?UTF-8?q?=20silently=20discarded=20at=20serde-deserialize=20time=20befor?= =?UTF-8?q?e=20any=20handler=20sees=20it;=20non-streaming=20ChatMessage=20?= =?UTF-8?q?has=20the=20same=20gap;=20is=5Freasoning=5Fmodel=20classifier?= =?UTF-8?q?=20already=20returns=20true=20for=20o1/o3/o4/grok-3-mini/qwen-q?= =?UTF-8?q?wq/qwq/*thinking*=20and=20is=20consulted=20at=20line=20901=20to?= =?UTF-8?q?=20strip=20request-side=20tuning=20params=20but=20never=20on=20?= =?UTF-8?q?the=20response=20side=20to=20opt=20into=20reasoning=5Fcontent?= =?UTF-8?q?=20extraction;=20local=20taxonomy=20already=20declares=20Output?= =?UTF-8?q?ContentBlock::Thinking=20and=20ContentBlockDelta::ThinkingDelta?= =?UTF-8?q?=20and=20the=20Anthropic=20native=20path=20correctly=20emits=20?= =?UTF-8?q?both=20with=20full=20test=20coverage=20at=20sse.rs:260,288=20?= =?UTF-8?q?=E2=80=94=20the=20openai-compat=20translator=20has=20the=20dest?= =?UTF-8?q?ination=20types=20one=20import=20away=20and=20never=20bridges?= =?UTF-8?q?=20to=20them=20(Jobdori=20cycle=20#366=20/=20extends=20#168c=20?= =?UTF-8?q?emission-routing=20audit=20/=20sibling-shape=20cluster=20grows?= =?UTF-8?q?=20to=20thirteen:=20#201/#202/#203/#206/#207/#208/#209/#210/#21?= =?UTF-8?q?1/#212/#213/#214=20/=20reasoning-fidelity=20trio:=20#207+#211+#?= =?UTF-8?q?214=20/=20wire-format-parity=20cluster:=20#211+#212+#213+#214?= =?UTF-8?q?=20/=20external=20validation:=20DeepSeek=20API=20docs,=20vLLM?= =?UTF-8?q?=20reasoning-outputs,=20anomalyco/opencode#24124,=20charmbracel?= =?UTF-8?q?et/crush,=20simonw/llm,=20Vercel=20AI=20SDK,=20LangChain=20Base?= =?UTF-8?q?ChatOpenAI,=20LiteLLM,=20continue.dev#9245)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ROADMAP.md | 259 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 259 insertions(+) diff --git a/ROADMAP.md b/ROADMAP.md index 16040f0..79ea122 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -13960,3 +13960,262 @@ The deeper fix is a unified-registry refactor (`MODEL_PARAM_REQUIREMENTS` table **Status:** Open. No code changed. Filed 2026-04-25 21:30 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: c009818. Sibling-shape cluster (silent-fallback / silent-drop / silent-strip / silent-misnomer / silent-shadow / silent-prefix-mismatch / structural-absence / silent-zero-coercion at provider/CLI boundary): #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213 — twelve pinpoints, one unified-registry refactor (`MODEL_PARAM_REQUIREMENTS` with five columns: `tuning_params_strip`, `max_output_tokens`, `max_tokens_param_name`, `default_parallel_tool_calls`, `cache_token_wire_shape`) closes them all. Cost-parity cluster: #204 (token emission) + #207 (token preservation) + #209 (cost estimation) + #210 (max_tokens registry parity) + #213 (cache token visibility) — five pinpoints, all openai-compat boundary. 
Wire-format-parity cluster: #211 (max_tokens parameter name) + #212 (parallel_tool_calls / disable_parallel_tool_use) + #213 (cached_tokens / prompt_cache_hit_tokens). External validation: OpenAI prompt caching docs (https://platform.openai.com/docs/guides/prompt-caching), DeepSeek pricing docs (https://api-docs.deepseek.com/quick_start/pricing), anomalyco/opencode#17223/#17121/#17056/#11995 (active issues on identical pattern), Vercel AI SDK `LanguageModelV1Usage.cachedInputTokens`, charmbracelet/crush usage telemetry, simonw/llm `--show-cached-tokens`, ddhigh.com (2026-03-26) third-party proxy fix with 87% per-request cost reduction — same control surface available across the entire ecosystem, absent only in claw-code. 🪨 + +## Pinpoint #214 — `ChunkDelta` deserializes only `content` and `tool_calls`; the openai-compat streaming path drops `delta.reasoning_content` entirely, silently discarding chain-of-thought text from DeepSeek `deepseek-reasoner`, Alibaba Qwen3-Thinking, QwQ, and any vLLM-served reasoning backend even though `is_reasoning_model()` already returns true for those families and the local `OutputContentBlock::Thinking`/`ContentBlockDelta::ThinkingDelta` taxonomy fully exists for the Anthropic native path (Jobdori, cycle #366 / extends #168c emission-routing audit / sibling-shape cluster grows to thirteen / completes the openai-compat reasoning-fidelity trio with #211 + #207) + +**Observed:** In `rust/crates/api/src/providers/openai_compat.rs:735-741`, the streaming-chunk delta deserialization struct captures exactly two fields: + +```rust +#[derive(Debug, Default, Deserialize)] +struct ChunkDelta { + #[serde(default)] + content: Option<String>, + #[serde(default, deserialize_with = "deserialize_null_as_empty_vec")] + tool_calls: Vec<ToolCallDelta>, +} +``` + +There is no `reasoning_content`, no `reasoning`, no `thinking`, no `chain_of_thought` field, no fallback accessor, no `serde(flatten)` capture into a side-channel `extra: HashMap<String, serde_json::Value>`. The wire field that DeepSeek's reasoning-model API places at `choices[].delta.reasoning_content` (sibling to `content`, not nested) and that vLLM emits at the same path for any reasoning-tuned backend is silently dropped at serde-deserialize time, before any handler sees it. + +The non-streaming response shape has the same gap. `ChatMessage` (lines 686-693) deserializes only `role`, `content`, and `tool_calls`: + +```rust +#[derive(Debug, Deserialize)] +struct ChatMessage { + role: String, + #[serde(default)] + content: Option<String>, + #[serde(default)] + tool_calls: Vec<ToolCall>, +} +``` + +No `reasoning_content` here either. A non-streaming `claw prompt --no-stream --model deepseek/deepseek-reasoner "explain X"` returns `MessageResponse.content` with the final answer only — the entire CoT is invisible.
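+
+The drop mechanism is ordinary serde behavior, not an explicit discard: without `deny_unknown_fields`, unknown sibling keys are skipped with no error, no log line, and no residue a downstream handler could inspect. A minimal, self-contained sketch of that silence (standalone mirror struct and a made-up wire payload, not claw-code's actual types):
+
+```rust
+use serde::Deserialize;
+
+// Standalone mirror of the ChunkDelta shape above — illustrative only.
+#[derive(Debug, Default, Deserialize)]
+struct MirrorChunkDelta {
+    #[serde(default)]
+    content: Option<String>,
+    // tool_calls elided for brevity
+}
+
+fn main() {
+    // DeepSeek-style wire delta: reasoning_content is a sibling of content.
+    let wire = r#"{
+        "content": null,
+        "reasoning_content": "Let me think step by step..."
+    }"#;
+    let delta: MirrorChunkDelta = serde_json::from_str(wire).unwrap();
+    // Deserialization succeeds; the unknown key is skipped silently.
+    // The chain-of-thought text is unrecoverable from this point on.
+    assert_eq!(delta.content, None);
+    println!("{:?}", delta); // MirrorChunkDelta { content: None }
+}
+```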
+ +**Repository surface (verified 2026-04-25 21:55 KST):** + +```bash +$ cd ~/clawd/claw-code +$ grep -rn "reasoning_content\|reasoning:" rust/ src/ tests/ docs/ 2>/dev/null +# (empty — zero hits anywhere in the codebase) + +$ grep -rn "completion_tokens_details\|reasoning_tokens" rust/crates/api/src/ 2>/dev/null +# (empty) + +$ grep -n "Thinking\|ThinkingDelta" rust/crates/api/src/providers/openai_compat.rs +# rust/crates/api/src/providers/openai_compat.rs:790: // Alibaba DashScope reasoning variants (QwQ + Qwen3-Thinking family) +# (one comment line; zero code paths) + +$ grep -rn "ContentBlock.*Thinking\|OutputContentBlock::Thinking\|ThinkingDelta" rust/crates/api/src/ +# rust/crates/api/src/types.rs:156: Thinking { +# rust/crates/api/src/types.rs:162: RedactedThinking { +# rust/crates/api/src/types.rs:245: ThinkingDelta { thinking: String }, +# rust/crates/api/src/sse.rs:260: content_block: OutputContentBlock::Thinking { +# rust/crates/api/src/sse.rs:288: delta: ContentBlockDelta::ThinkingDelta { + +# Result: the local taxonomy has Thinking content blocks and ThinkingDelta variants, +# the Anthropic SSE parser at sse.rs:260/288 emits both with test coverage, +# and the openai-compat path has neither a reader for the wire field nor an emitter +# for the local event variant. The lane is half-built: declared in types.rs, +# emitted by anthropic.rs, and structurally absent from openai_compat.rs. + +$ grep -n "is_reasoning_model" rust/crates/api/src/providers/openai_compat.rs +# rust/crates/api/src/providers/openai_compat.rs:780:pub fn is_reasoning_model(model: &str) -> bool { +# rust/crates/api/src/providers/openai_compat.rs:901: if !is_reasoning_model(&request.model) { + +# is_reasoning_model already classifies o1/o3/o4/grok-3-mini/qwen-qwq/qwq/*thinking* +# at line 780. The classifier is used at line 901 to strip tuning params for the +# request side — but the same classifier is never consulted on the response side +# to opt into reasoning_content extraction. Half-applied taxonomy, same shape as #211. +``` + +**Blast radius (verified by `grep -rn "OpenAiCompatClient\|openai_compat::" rust/crates/`):** + +- Every `claw prompt` against any of these wire model ids streams reasoning content into `/dev/null`: + - DeepSeek: `deepseek-reasoner`, `deepseek-chat` with `thinking=true`, `deepseek-v3.2-pro` thinking mode + - Alibaba DashScope: `qwen-qwq-32b`, `qwq-plus`, `qwen3-30b-a3b-thinking`, `qwen3-72b-thinking`, `qwen3-235b-a22b-thinking` + - vLLM-hosted: any model started with `--enable-reasoning --reasoning-parser deepseek_r1` + - SiliconFlow / OpenRouter / Together-passed reasoning models that follow the DeepSeek wire convention + - Future OpenAI o-series if/when OpenAI surfaces CoT through `delta.reasoning_content` (the public Responses API already exposes `reasoning.summary`; the path is the same shape) + +- Every claw consuming the SSE stream sees an empty `content_block_delta` window for the reasoning portion. A long DeepSeek-reasoner answer with 5 minutes of CoT and a 100-token final answer streams as: `MessageStart → ContentBlockStart(Text) → ContentBlockDelta(TextDelta {final answer text}) → ContentBlockStop → MessageDelta`. No `Thinking` block, no `ThinkingDelta`, no signal that the model spent five minutes reasoning.
The output ledger shows `output_tokens` matching the final-answer length, even though billed `completion_tokens` from the upstream is 50× larger because reasoning tokens are billed too (the disconnect that #207 catches at the counter layer; #214 is the same disconnect at the content layer). + +- Hooks that would render a reasoning panel (sibling tool surface to claw-code's existing `--show-thinking` UX for Anthropic extended thinking) cannot fire on OpenAI-compat sessions. The TUI has no source for the data. There is no event for it. + +- Multi-turn conversations against `deepseek-reasoner` happen to comply on the input side — DeepSeek docs explicitly require dropping `reasoning_content` from history to avoid 400 errors (https://api-docs.deepseek.com/guides/reasoning_model) — but only because claw-code never captured the field in the first place; the compliance is accidental, not designed. **However:** newer reasoning models (DeepSeek V4-Pro thinking mode, future OpenAI Responses API turns) require the *opposite* — `reasoning_content` MUST be passed back across turns when a tool was called in the previous turn. Without a parser, claw-code structurally cannot comply with the newer contract; multi-turn tool-call sessions against V4-Pro will return 400 with no path to remediation short of upstream-version locking. + +- The Anthropic native path has full content-side parity. `sse.rs:260,288` parses `content_block_start` with `type: "thinking"` and `content_block_delta` with `delta.type: "thinking_delta"` into `OutputContentBlock::Thinking { thinking, signature }` and `ContentBlockDelta::ThinkingDelta { thinking }`. Tests at `sse.rs:243-296` assert both directions. The capability exists end-to-end on Anthropic. The OpenAI-compat translator has the destination types one import away and never bridges to them. + +**Gap:** + +1. **One canonical wire shape, three upstream spellings, zero claw-code reader.** DeepSeek emits `choices[0].delta.reasoning_content: "step text"` (sibling to `content`, since R1 release Jan 2026 and continuing through V3.2/V4-Pro). vLLM with `deepseek_r1` parser emits the same. SiliconFlow follows DeepSeek. OpenRouter wraps both. Some downstream proxies (LiteLLM, Helicone, Portkey) re-shape this into `delta.reasoning: { summary: "..."}` or `delta.thinking: "..."` to align with OpenAI's draft Responses API extension. Three wire spellings, no claw-code reader for any of them. Adding a top-level `reasoning_content: Option<String>` field to `ChunkDelta` covers the first two; absorbing the third requires `serde(alias = "reasoning")` plus an enum discriminator (see the sketch after item 2 below). Same one-shape-per-provider asymmetry as #213 (cached_tokens vs prompt_cache_hit_tokens) and #211 (max_tokens vs max_completion_tokens vs max_output_tokens). + +2. **Half-applied taxonomy within a single 30-line span.** `is_reasoning_model(model)` at line 780 already returns `true` for o1/o3/o4/grok-3-mini/qwen-qwq*/qwq*/*thinking*. At line 901, `build_chat_completion_request` consults this classifier to strip tuning params on the *request* side. The mirror call site for the *response* side — "if `is_reasoning_model(model)`, parse `delta.reasoning_content` into `ContentBlockDelta::ThinkingDelta`" — does not exist. The taxonomy knows which models reason; the deserializer was never wired up to act on that knowledge. Same shape as #211 (gpt-5-prefix gate at request side, no o-series gate at response side).
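+
+A sketch of what absorbing all three spellings could look like at the serde layer: hypothetical names (`ReasoningField`, `SketchChunkDelta`), assuming the proxy object form carries a `summary` key as described in item 1; illustrative, not a design commitment:
+
+```rust
+use serde::Deserialize;
+
+// The three observed spellings: a bare string (DeepSeek/vLLM/SiliconFlow)
+// or a proxy-reshaped object carrying a summary (LiteLLM/Helicone/Portkey).
+#[derive(Debug, Deserialize)]
+#[serde(untagged)]
+enum ReasoningField {
+    Text(String),
+    Summary { summary: String },
+}
+
+#[derive(Debug, Default, Deserialize)]
+struct SketchChunkDelta {
+    #[serde(default)]
+    content: Option<String>,
+    // One field, three wire names: canonical plus two aliases.
+    #[serde(default, alias = "reasoning", alias = "thinking")]
+    reasoning_content: Option<ReasoningField>,
+}
+
+fn main() {
+    for wire in [
+        r#"{ "reasoning_content": "step text" }"#,
+        r#"{ "reasoning": { "summary": "condensed CoT" } }"#,
+        r#"{ "thinking": "raw CoT" }"#,
+    ] {
+        let delta: SketchChunkDelta = serde_json::from_str(wire).unwrap();
+        println!("{:?}", delta.reasoning_content); // Some(...) in all three cases
+    }
+}
+```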
+ +3. **Local taxonomy already declares the destination type.** `rust/crates/api/src/types.rs:156-162` defines `OutputContentBlock::Thinking { thinking: String, signature: Option<String> }` and `OutputContentBlock::RedactedThinking { data: Value }`. Line 245 declares `ContentBlockDelta::ThinkingDelta { thinking: String }`. Both variants are emitted by `rust/crates/api/src/providers/anthropic.rs` via `sse.rs` parsing. The type slot exists, the emitter exists for one provider, the second provider has neither emitter nor reader. Adding the missing emitter is mechanical; the gap is structural absence, not design ambiguity. + +4. **Zero event for the dropped data.** No `reasoning_content_dropped` event. No `unsupported_chunk_field` event. No log line. No telemetry counter. A claw inspecting the SSE stream cannot tell whether the upstream model produced no reasoning, produced reasoning that was dropped, or is a non-reasoning model. Same opacity pattern as #201 (silent tool-arg fallback), #202 (silent tool-message drop), #203 (no AutoCompactionEvent), #207 (silent zero-fill on usage), #208 (silent param strip), #211 (silent prefix-mismatch), #212 (no parallel-tool-emission event), #213 (silent zero-coercion on cache tokens). #214 extends the cluster to thirteen with the same shape: provider-side fact known, claw event taxonomy silent. + +5. **Tests exist for the Anthropic side but are absent for the OpenAI-compat side.** `rust/crates/api/src/sse.rs:243-296` has `parses_thinking_related_deltas` and a `parses_thinking_content_block_start` companion. Both assert that the SSE parser emits the `Thinking` content block and `ThinkingDelta` events. Searching the openai_compat module for analogous tests: + +```bash +$ grep -n "fn .*reasoning\|fn .*thinking\|reasoning_content" rust/crates/api/src/providers/openai_compat.rs +# (empty) +``` + +Zero tests. The asymmetry mirrors the production gap exactly — the test-coverage shape is `anthropic.thinking: present, openai_compat.reasoning_content: absent`. Same shape as #211 (no o-series test) and #212 (no parallel-tool modifier test) and #213 (no cached-token visibility test). + +6. **No CLI surface, no plugin override, no environment knob.** `grep -rn "show.thinking\|show.reasoning\|--reasoning\|--thinking" rust/crates/rusty-claude-cli/` returns hits only for the request-side `--reasoning-effort` flag (`main.rs:823, 925, 935`, etc. — the input parameter) and zero hits for any response-side reasoning-visibility flag. No `~/.claw/config.toml` entry for `[reasoning] show_chain_of_thought = true`. Plugins cannot inject the missing field through `extra_body` or post-process it, because the field is dropped at serde-deserialize time, before any plugin sees it. No surface area for users or operators to access reasoning content on OpenAI-compat paths. This mirrors the surface gap in #213 (no `--show-cache-stats`) and #207 (no `--show-reasoning-tokens`). + +7. **Compounding with the existing reasoning-model cluster.** #207 ("`completion_tokens_details.reasoning_tokens` not deserialized") catches the missing *count* of reasoning tokens at the usage layer. #211 ("`max_tokens` sent to o-series instead of `max_completion_tokens`") catches the missing *parameter routing* for reasoning models on the request side. #214 catches the missing *content stream* for reasoning models on the response side. The trio (#207 + #211 + #214) covers request → response → metering for the full reasoning-model lifecycle on the openai-compat path.
All three are independently broken; fixing only one or two leaves a half-functional reasoning lane. The deeper fix is a cluster-wide `ReasoningContract { request_param_name, response_content_field, usage_counter_field, multi_turn_passthrough_required }` table similar to the `MODEL_PARAM_REQUIREMENTS` shape recorded in #211/#212/#213. + +8. **External validation — every adjacent agent ships a reader for this field.** + - **anomalyco/opencode #24124** (active issue, verified 2026-04-25 via web search) tracks the multi-turn `reasoning_content` 400 error on `deepseek-reasoner`, confirming the field is live, industry-wide wire traffic. The fix shipped in opencode parses `delta.reasoning_content` into a `reasoning` content part with `providerOptions.reasoning_content` round-trip support across turns. + - **charmbracelet/crush** parses `delta.reasoning_content` and routes it into the TUI's reasoning panel via `usage.reasoning_text` — surfaced behind a `--show-reasoning` flag with a config-file equivalent. + - **simonw/llm** exposes `--show-cot` (chain-of-thought) for any provider that returns `reasoning_content` or `reasoning`. + - **Vercel AI SDK** `LanguageModelV1Usage` extends with `reasoningTokens` and the message stream emits `reasoning` parts (https://ai-sdk.dev/docs/ai-sdk-core/generating-text#reasoning). + - **LangChain** `BaseChatOpenAI` returns `additional_kwargs.reasoning_content` for DeepSeek/Qwen reasoning models since v0.3.x. + - **vLLM** built-in `--reasoning-parser deepseek_r1` (https://docs.vllm.ai/en/latest/features/reasoning_outputs.html) standardizes the wire format so any downstream agent has one shape to read. + - **LiteLLM** wraps reasoning_content into a unified `provider_specific_fields.reasoning` slot and surfaces `--show-reasoning` per-call. + - **continue.dev** had the same gap filed at #9245 and shipped the fix. + - **siliconflow.cn**, **agentscope-ai/QwenPaw#3782**, **dataleadsfuture.com integration guide**: same wire shape, same parser, all shipped. + - claw-code is the only mainstream agent CLI in the cluster without any reader for `delta.reasoning_content`. The control surface is universal across the ecosystem and structurally absent here. + +**Repro (verified 2026-04-25 21:55 KST):** + +```bash +# 1. Confirm zero hits for reasoning_content across the entire repo +cd ~/clawd/claw-code +grep -rn "reasoning_content" rust/ src/ tests/ docs/ 2>/dev/null +# Output: (empty — verified) + +# 2. Confirm ChunkDelta lacks reasoning_content +grep -A 6 "^struct ChunkDelta" rust/crates/api/src/providers/openai_compat.rs +# struct ChunkDelta { +#     #[serde(default)] +#     content: Option<String>, +#     #[serde(default, deserialize_with = "deserialize_null_as_empty_vec")] +#     tool_calls: Vec<ToolCallDelta>, +# } + +# 3. Confirm ChatMessage (non-streaming) lacks reasoning_content +grep -A 6 "^struct ChatMessage" rust/crates/api/src/providers/openai_compat.rs +# struct ChatMessage { +#     role: String, +#     #[serde(default)] +#     content: Option<String>, +#     #[serde(default)] +#     tool_calls: Vec<ToolCall>, +# } + +# 4. Confirm Anthropic native path correctly emits ThinkingDelta +grep -n "ContentBlockDelta::ThinkingDelta\|OutputContentBlock::Thinking" rust/crates/api/src/sse.rs +# rust/crates/api/src/sse.rs:260: content_block: OutputContentBlock::Thinking { +# rust/crates/api/src/sse.rs:288: delta: ContentBlockDelta::ThinkingDelta { + +# 5. 
Confirm the destination types are declared and ready +grep -n "ThinkingDelta\|Thinking {" rust/crates/api/src/types.rs +# rust/crates/api/src/types.rs:156: Thinking { +# rust/crates/api/src/types.rs:158: thinking: String, +# rust/crates/api/src/types.rs:245: ThinkingDelta { thinking: String }, + +# 6. Confirm is_reasoning_model already classifies the relevant families +grep -A 14 "pub fn is_reasoning_model" rust/crates/api/src/providers/openai_compat.rs +# pub fn is_reasoning_model(model: &str) -> bool { +# ... +# canonical.starts_with("o1") || canonical.starts_with("o3") || canonical.starts_with("o4") +# || canonical == "grok-3-mini" +# || canonical.starts_with("qwen-qwq") || canonical.starts_with("qwq") +# || canonical.contains("thinking") +# } + +# 7. Confirm zero test coverage for reasoning_content on the openai-compat path +grep -n "fn .*reasoning\|fn .*thinking" rust/crates/api/src/providers/openai_compat.rs +# (empty — only the comment at line 790 mentions "Qwen3-Thinking family") +``` + +```rust +// 8. Demonstrative tests that should exist and currently do not + +#[test] +fn deepseek_reasoner_streaming_chunk_emits_thinking_delta() { + // DeepSeek reasoning-model wire format — sibling field at delta root + let chunk_json = r#"{ + "id": "chatcmpl-deepseek-1", + "model": "deepseek-reasoner", + "choices": [{ + "index": 0, + "delta": { + "role": "assistant", + "content": null, + "reasoning_content": "Let me think about this step by step. First, I need to..." + } + }] + }"#; + let chunk: ChatCompletionChunk = serde_json::from_str(chunk_json).unwrap(); + let mut state = StreamState::new("deepseek-reasoner".to_string()); + let events = state.ingest_chunk(chunk).unwrap(); + // currently: events contains zero ThinkingDelta variants — bug + // expected: events contains ContentBlockStart(Thinking) followed by ThinkingDelta { thinking: "Let me think..." } + let has_thinking = events.iter().any(|e| matches!(e, + StreamEvent::ContentBlockDelta(ContentBlockDeltaEvent { + delta: ContentBlockDelta::ThinkingDelta { .. }, .. + }) + )); + assert!(has_thinking, "reasoning_content should map to ThinkingDelta"); +} + +#[test] +fn qwen3_thinking_non_streaming_response_carries_reasoning_block() { + // Alibaba DashScope wire format for Qwen3-Thinking — non-streaming + let response_json = r#"{ + "id": "qwen-1", + "model": "qwen3-30b-a3b-thinking", + "choices": [{ + "message": { + "role": "assistant", + "content": "The answer is 42.", + "reasoning_content": "I considered options A, B, C and concluded 42." + }, + "finish_reason": "stop" + }], + "usage": { "prompt_tokens": 10, "completion_tokens": 50 } + }"#; + let response: ChatCompletionResponse = serde_json::from_str(response_json).unwrap(); + let normalized = normalize_response(response, "qwen3-30b-a3b-thinking").unwrap(); + // currently: normalized.content has only the Text block — bug + // expected: normalized.content has [Thinking { thinking: "I considered..." }, Text { text: "The answer is 42." }] + let has_thinking_block = normalized.content.iter().any(|b| matches!(b, OutputContentBlock::Thinking { .. 
}));  + assert!(has_thinking_block, "reasoning_content should map to Thinking content block"); +} + +#[test] +fn non_reasoning_model_with_no_reasoning_content_is_unaffected() { + // Backward compat — gpt-4o has no reasoning_content; behavior must be unchanged + let chunk_json = r#"{ + "id": "chatcmpl-1", + "model": "gpt-4o", + "choices": [{ "index": 0, "delta": { "content": "hello" } }] + }"#; + let chunk: ChatCompletionChunk = serde_json::from_str(chunk_json).unwrap(); + let mut state = StreamState::new("gpt-4o".to_string()); + let events = state.ingest_chunk(chunk).unwrap(); + // expected: ContentBlockStart(Text) + ContentBlockDelta(TextDelta) — same as today + let has_thinking = events.iter().any(|e| matches!(e, + StreamEvent::ContentBlockDelta(ContentBlockDeltaEvent { + delta: ContentBlockDelta::ThinkingDelta { .. }, .. + }) + )); + assert!(!has_thinking, "non-reasoning models should not synthesize ThinkingDelta"); +} +``` + +**Fix shape (not implemented in this cycle, recorded for cluster refactor):** + +The minimal fix is a four-touch change: (a) add `#[serde(default)] reasoning_content: Option<String>` to `ChunkDelta` and `ChatMessage`, plus `#[serde(alias = "reasoning")]` for the proxy variant; (b) in `StreamState::ingest_chunk`, when `delta.reasoning_content` is `Some(text)` and non-empty, emit `ContentBlockStart(OutputContentBlock::Thinking { thinking: "" })` on first sight (separate block index from text) followed by `ContentBlockDelta(ContentBlockDelta::ThinkingDelta { thinking: text })`; (c) in `normalize_response`, when `message.reasoning_content` is `Some(text)`, prepend an `OutputContentBlock::Thinking { thinking: text, signature: None }` to the content vec; (d) add three regression tests covering streaming/non-streaming/backward-compat. Estimate: ~50 LOC production + ~80 LOC test. Plus a `StreamEvent::ReasoningContentReceived { count: u32 }` event for cluster-wide event-emission parity (sibling fix to #201/#202/#203/#208/#211/#212/#213). A standalone sketch of touches (a) and (b) closes this entry, below the Status line. + +The deeper fix is a unified-registry refactor (`MODEL_PARAM_REQUIREMENTS` table — sibling fix shape recorded in #211/#212/#213) that adds a sixth column `response_reasoning_field: ReasoningContent | Reasoning | None` describing how each provider exposes CoT, plus a cluster-wide `ChunkDelta::extract_reasoning(provider)` helper that owns the per-provider translation and emits the structured event. This closes #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214 in one structural change. + +**Status:** Open. No code changed. Filed 2026-04-25 22:00 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: 347102d. Sibling-shape cluster (silent-fallback / silent-drop / silent-strip / silent-misnomer / silent-shadow / silent-prefix-mismatch / structural-absence / silent-zero-coercion / silent-content-discard at provider/CLI boundary): #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214 — thirteen pinpoints, one unified-registry refactor (`MODEL_PARAM_REQUIREMENTS` with six columns: `tuning_params_strip`, `max_output_tokens`, `max_tokens_param_name`, `default_parallel_tool_calls`, `cache_token_wire_shape`, `response_reasoning_field`) closes them all. Reasoning-fidelity cluster (the openai-compat reasoning-model lifecycle): #207 (reasoning_tokens counter) + #211 (max_completion_tokens param) + #214 (reasoning_content stream) — three pinpoints, one reasoning lane. 
Wire-format-parity cluster: #211 (max_tokens) + #212 (parallel_tool_calls) + #213 (cached_tokens) + #214 (reasoning_content) — four pinpoints, all upstream-contract divergence at the provider boundary. External validation: DeepSeek API docs (https://api-docs.deepseek.com/guides/reasoning_model + https://api-docs.deepseek.com/guides/thinking_mode), vLLM reasoning-outputs docs (https://docs.vllm.ai/en/latest/features/reasoning_outputs.html), anomalyco/opencode#24124 (active issue, identical pattern), charmbracelet/crush usage telemetry, simonw/llm `--show-cot`, Vercel AI SDK `LanguageModelV1Usage.reasoningTokens` + message stream `reasoning` parts, LangChain `BaseChatOpenAI` `additional_kwargs.reasoning_content`, LiteLLM `provider_specific_fields.reasoning`, continue.dev#9245, siliconflow.cn reasoning capabilities, agentscope-ai/QwenPaw#3782, dataleadsfuture.com R1 integration guide — same control surface available across the entire ecosystem, absent only in claw-code.
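+
+For the cluster record, a minimal sketch of touches (a) and (b) from the fix shape above, written against standalone stand-in types: `SketchEvent` and `SketchStreamState` are hypothetical simplifications of the `StreamEvent`/`StreamState` machinery, not the real signatures.
+
+```rust
+use serde::Deserialize;
+
+// Stand-ins for ContentBlockStart(Thinking) / ThinkingDelta / TextDelta.
+#[derive(Debug)]
+enum SketchEvent {
+    ThinkingStart,
+    ThinkingDelta { thinking: String },
+    TextDelta { text: String },
+}
+
+// Touch (a): reasoning_content as a sibling of content, proxy alias included.
+#[derive(Debug, Default, Deserialize)]
+struct SketchChunkDelta {
+    #[serde(default)]
+    content: Option<String>,
+    #[serde(default, alias = "reasoning")]
+    reasoning_content: Option<String>,
+}
+
+// Touch (b): first sight of reasoning text opens a Thinking block on its
+// own index; every subsequent chunk appends a ThinkingDelta.
+#[derive(Default)]
+struct SketchStreamState {
+    thinking_open: bool,
+}
+
+impl SketchStreamState {
+    fn ingest(&mut self, delta: SketchChunkDelta) -> Vec<SketchEvent> {
+        let mut events = Vec::new();
+        if let Some(t) = delta.reasoning_content.filter(|t| !t.is_empty()) {
+            if !self.thinking_open {
+                self.thinking_open = true;
+                events.push(SketchEvent::ThinkingStart);
+            }
+            events.push(SketchEvent::ThinkingDelta { thinking: t });
+        }
+        if let Some(text) = delta.content.filter(|t| !t.is_empty()) {
+            events.push(SketchEvent::TextDelta { text });
+        }
+        events
+    }
+}
+
+fn main() {
+    let mut state = SketchStreamState::default();
+    for wire in [
+        r#"{ "reasoning_content": "step 1..." }"#,
+        r#"{ "reasoning_content": "step 2..." }"#,
+        r#"{ "content": "final answer" }"#,
+    ] {
+        let delta: SketchChunkDelta = serde_json::from_str(wire).unwrap();
+        println!("{:?}", state.ingest(delta));
+    }
+}
+```
+
+🪨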