claw-code

mirror of https://github.com/instructkr/claw-code.git synced 2026-06-10 08:22:14 +08:00

Author	SHA1	Message	Date
Jobdori	da451c66db	docs(roadmap): file /responses tool-schema compatibility bug as #33	2026-04-08 21:23:45 +09:00
Jobdori	ad38032ab8	docs(roadmap): file /responses tool-schema compatibility bug as #33	2026-04-08 21:23:37 +09:00
Jobdori	7173f2d6c6	docs(roadmap): file /responses tool-schema compatibility bug as #33	2026-04-08 21:23:28 +09:00
Jobdori	a0b4156174	docs(roadmap): file /responses tool-schema compatibility bug as #33	2026-04-08 21:23:20 +09:00
Jobdori	3bf45fc44a	docs(roadmap): file /responses tool-schema compatibility bug as #33	2026-04-08 21:23:12 +09:00
Jobdori	af58b6a7c7	docs(roadmap): file /responses tool-schema compatibility bug as #33	2026-04-08 21:23:04 +09:00
Jobdori	514c3da7ad	docs(roadmap): file /responses tool-schema compatibility bug as #33	2026-04-08 21:22:56 +09:00
Jobdori	5c69713158	docs(roadmap): file OpenAI-compat model-id passthrough gap as #32	2026-04-08 19:48:34 +09:00
Jobdori	939d0dbaa3	docs(roadmap): file OpenAI-compat model-id passthrough gap as #32	2026-04-08 19:48:28 +09:00
Jobdori	bfd5772716	docs(roadmap): file OpenAI-compat model-id passthrough gap as #32	2026-04-08 19:48:21 +09:00
Jobdori	e0c3ff1673	docs(roadmap): file executor-contract leaks as ROADMAP #31	2026-04-08 18:34:58 +09:00
Jobdori	252536be74	fix(tools): serialize web_search env-var tests with env_lock to prevent race web_search_extracts_and_filters_results set CLAWD_WEB_SEARCH_BASE_URL without holding env_lock(), while the sibling test web_search_handles_generic_links_and_invalid_base_url always held it. Under parallel test execution the two tests interleave set_var/remove_var calls, pointing the search client at the wrong mock server port and causing assertion failures. Fix: add env_lock() guard at the top of web_search_extracts_and_filters_results, matching the serialization pattern already used by every other env-mutating test in this module. Root cause of CI flake on run 24127551802. Identified and fixed during dogfood session.	2026-04-08 18:34:06 +09:00
Jobdori	275b58546d	feat(cli): populate Git SHA, target triple, and build date at compile time via build.rs Add rust/crates/rusty-claude-cli/build.rs that: - Captures git rev-parse --short HEAD at build time → GIT_SHA env - Reads Cargo's TARGET env var → TARGET env - Derives BUILD_DATE from SOURCE_DATE_EPOCH / BUILD_DATE env or the current date via `date +%Y-%m-%d` fallback - Registers rerun-if-changed on .git/HEAD and .git/refs so the SHA stays fresh across commits Update main.rs DEFAULT_DATE to pick up BUILD_DATE from option_env!() instead of the hardcoded 2026-03-31 static string. Before: `claw --version` always showed Git SHA: unknown, Target: unknown, Build date: 2026-03-31 in local builds. After: e.g. Git SHA: `7f53d82`, Target: aarch64-apple-darwin, Build date: 2026-04-08 Generated by droid (Kimi K2.5 Turbo) via acpx (wrote build.rs), cleaned up by Jobdori (added BUILD_DATE step, updated main.rs const). Co-Authored-By: Droid <noreply@factory.ai>	2026-04-08 18:11:46 +09:00
Jobdori	7f53d82b17	docs(roadmap): file DashScope routing fix as #30 (done at `adcea6b`)	2026-04-08 18:05:17 +09:00
Jobdori	adcea6bceb	fix(api): route DashScope models to dashscope config, not openai ProviderClient::from_model_with_anthropic_auth was dispatching every ProviderKind::OpenAi match to OpenAiCompatConfig::openai(), which reads OPENAI_API_KEY and points at api.openai.com. But DashScope models (qwen-plus, qwen/qwen3-coder, etc.) also return ProviderKind::OpenAi from detect_provider_kind because DashScope speaks the OpenAI wire format. The metadata layer correctly identifies them as needing DASHSCOPE_API_KEY and the DashScope compatible-mode endpoint, but that metadata was being ignored at dispatch time. Result: users running `claw --model qwen-plus` with DASHSCOPE_API_KEY set would get a "missing OPENAI_API_KEY" error instead of being routed to DashScope. Fix: consult providers::metadata_for_model in the OpenAi dispatch arm and pick dashscope() vs openai() based on metadata.auth_env. Adds a regression test asserting ProviderClient::from_model("qwen-plus") builds with the DashScope base URL. Exposes a pub base_url() accessor on OpenAiCompatClient so the test can verify the routing. Authored by droid (Kimi K2.5 Turbo) via acpx, cleaned up by Jobdori (removed unsafe blocks unnecessary under edition 2021, imported ProviderClient from super, adopted EnvVarGuard pattern from providers/mod.rs tests). Co-Authored-By: Droid <noreply@factory.ai>	2026-04-08 18:04:37 +09:00
YeonGyu-Kim	b1491791df	docs(roadmap): mark #21 and #29 as done #21 (Resumed /status JSON parity gap): resolved by the broader Resumed local-command JSON parity gap work tracked as #26. Re-verified on main HEAD `8dc6580` — the regression test passes. #29 (CLI provider dispatch hardcoded to Anthropic): landed at `8dc6580`. ApiProviderClient dispatch now routes correctly based on detect_provider_kind. Original filing preserved as trace record.	2026-04-08 17:43:47 +09:00
YeonGyu-Kim	8dc65805c1	fix(cli): dispatch to correct provider backend based on model prefix — closes ROADMAP #29 The CLI entry point (build_runtime_with_plugin_state in main.rs) was hardcoded to always instantiate AnthropicRuntimeClient with an AnthropicClient, regardless of what detect_provider_kind(model) returned. This meant `--model openai/gpt-4` with OPENAI_API_KEY set and no ANTHROPIC_* vars still failed with "missing Anthropic credentials" because the CLI never dispatched to the OpenAI-compat backend that already exists in the api crate. Root cause: AnthropicRuntimeClient.client was typed as AnthropicClient (concrete) rather than ApiProviderClient (enum). The api crate already had a ProviderClient enum with Anthropic / Xai / OpenAi variants that dispatches correctly via detect_provider_kind, plus a unified MessageStream enum that wraps both anthropic::MessageStream and openai_compat::MessageStream with the same next_event() -> StreamEvent interface. The CLI just wasn't using it. Changes (1 file, +59 -7): - Import api::ProviderClient as ApiProviderClient - Change AnthropicRuntimeClient.client from AnthropicClient to ApiProviderClient - In AnthropicRuntimeClient::new(), dispatch based on detect_provider_kind(&resolved_model): * Anthropic: build AnthropicClient directly with resolve_cli_auth_source() + api::read_base_url() + PromptCache (preserves ANTHROPIC_BASE_URL override for mock test harness and the session-scoped prompt cache) * xAI / OpenAi: delegate to ApiProviderClient::from_model_with_anthropic_auth which routes to OpenAiCompatClient::from_env with the matching config (reads OPENAI_API_KEY/XAI_API_KEY/DASHSCOPE_API_KEY and their BASE_URL overrides internally) - Change push_prompt_cache_record to take &ApiProviderClient (ProviderClient::take_last_prompt_cache_record returns None for non-Anthropic variants, so the helper is a no-op on OpenAI-compat providers without extra branching) What this unlocks for users: claw --model openai/gpt-4.1-mini prompt 'hello' # OpenAI claw --model grok-3 prompt 'hello' # xAI claw --model qwen-plus prompt 'hello' # DashScope OPENAI_BASE_URL=https://openrouter.ai/api/v1 \ claw --model openai/anthropic/claude-sonnet-4 prompt 'hello' # OpenRouter All previously broken, now routed correctly by prefix. Verification: - cargo build --release -p rusty-claude-cli: clean - cargo test --release -p rusty-claude-cli: 182 tests, 0 failures (including compact_output tests that exercise the Anthropic mock) - cargo fmt --all: clean - cargo clippy --workspace: warnings-only (pre-existing) - cargo test --release --workspace: all crates green except one pre-existing race in runtime::config::tests (passes in isolation) Source: live users nicma (1491342350960562277) and Jengro (1491345009021030533) in #claw-code on 2026-04-08.	2026-04-08 17:29:55 +09:00
YeonGyu-Kim	a9904fe693	docs(roadmap): file CLI provider dispatch bug as #29 , mark #28 as partial #28 error-copy improvements landed on `ff1df4c` but real users (nicma, Jengro) hit `error: missing Anthropic credentials` within hours when using `--model openai/gpt-4` with OPENAI_API_KEY set and all ANTHROPIC_* env vars unset on main. Traced root cause in build_runtime_with_plugin_state at line ~6244: AnthropicRuntimeClient::new() is hardcoded. BuiltRuntime is statically typed as ConversationRuntime<AnthropicRuntimeClient, ...>. providers::detect_provider_kind() computes the right routing at the metadata layer but the runtime client is never dispatched. Files #29 with the detailed trace + a focused action plan: DynamicApiClient enum wrapping Anthropic + OpenAiCompat variants, retype BuiltRuntime, dispatch in build_runtime based on detect_provider_kind, integration test with mock OpenAI-compat server. #28 is marked partial — the error-copy improvements are real and stayed in, but the routing gap they were meant to cover is the actual bug and needs #29 to land.	2026-04-08 17:01:14 +09:00
YeonGyu-Kim	ff1df4c7ac	fix(api): auth-provider error copy — prefix-routing hints + sk-ant-* bearer detection — closes ROADMAP #28 Two live users in #claw-code on 2026-04-08 hit adjacent auth confusion: varleg set OPENAI_API_KEY for OpenRouter but prefix routing didn't activate without openai/ model prefix, and stanley078852 put sk-ant-* in ANTHROPIC_AUTH_TOKEN (Bearer path) instead of ANTHROPIC_API_KEY (x-api-key path) and got 401 Invalid bearer token. Changes: 1. ApiError::MissingCredentials gained optional hint field (error.rs) 2. anthropic_missing_credentials_hint() sniffs OPENAI/XAI/DASHSCOPE env vars and suggests prefix routing when present (providers/mod.rs) 3. All 4 Anthropic auth paths wire the hint helper (anthropic.rs) 4. 401 + sk-ant-* in bearer token detected and hint appended 5. 'Which env var goes where' section added to USAGE.md Tests: unit tests for all three improvements (no HTTP calls needed). Workspace: all tests green, fmt clean, clippy warnings-only. Source: live users varleg + stanley078852 in #claw-code 2026-04-08. Co-authored-by: gaebal-gajae <gaebal-gajae@layofflabs.com>	2026-04-08 16:29:03 +09:00
YeonGyu-Kim	efa24edf21	docs(roadmap): file auth-provider truth pinpoint as backlog #28 Filed from live #claw-code dogfood on 2026-04-08 where two real users hit adjacent auth confusion within minutes: - varleg set OPENAI_API_KEY for OpenRouter but prefix routing didn't win because the model name wasn't prefixed with openai/; unsetting ANTHROPIC_API_KEY then hit MissingApiKey with no hint that the OpenAI path was already configured - stanley078852 put an sk-ant-* key in ANTHROPIC_AUTH_TOKEN instead of ANTHROPIC_API_KEY, causing claw to send it as Authorization: Bearer sk-ant-..., which Anthropic rejects at the edge with 401 Invalid bearer token Both fixes delivered live in #claw-code as direct replies, but the pattern is structural: the error surface doesn't bridge HTTP-layer symptoms back to env-var choice. Action block spells out a single main-side PR with three improvements: (a) MissingCredentials hint when an adjacent provider's env var is already set, (b) 401-on-Anthropic hint when bearer token starts with sk-ant-, (c) 'which env var goes where' paragraph in both README matrices mapping sk-ant-* -> x-api-key and OAuth access token -> Authorization: Bearer. All three improvements are unit-testable against ApiError::fmt output with no HTTP calls required.	2026-04-08 15:58:46 +09:00
YeonGyu-Kim	8339391611	docs(roadmap): correct #25 root cause — BrokenPipe tolerance, not chmod The original ROADMAP #25 entry claimed the root cause was missing exec bits on generated hook scripts. That was wrong — a chmod-only fix (4f7b674) still failed CI. The actual bug was output_with_stdin unconditionally propagating BrokenPipe from write_all when the child exits before the parent finishes writing stdin. Updated per gaebal-gajae's direction: actual fix, hygiene hardening, and regression guard are now clearly separated. Added a meta-lesson about Broken pipe ambiguity in fork/exec paths so future investigators don't cargo-cult the same wrong first theory.	2026-04-08 15:53:26 +09:00
YeonGyu-Kim	172a2ad50a	fix(plugins): chmod +x generated hook scripts + tolerate BrokenPipe in stdin write — closes ROADMAP #25 hotfix lane Two bugs found in the plugin hook test harness that together caused Linux CI to fail on 'hooks::tests::collects_and_runs_hooks_from_enabled_plugins' with 'Broken pipe (os error 32)'. Three reproductions plus one rerun failure on main today: 24120271422, 24120538408, 24121392171. Root cause 1 (chmod, defense-in-depth): write_hook_plugin writes pre.sh/post.sh/failure.sh via fs::write without setting the execute bit. While the runtime hook runner invokes hooks via 'sh <path>' (so the script file does not strictly need +x), missing exec perms can cause subtle fork/exec races on Linux in edge cases. Root cause 2 (the actual CI failure): output_with_stdin unconditionally propagated write_all errors on the child's stdin pipe, including BrokenPipe. A hook script that runs to completion in microseconds (e.g. a one-line printf) can exit and close its stdin before the parent finishes writing the JSON payload. Linux pipes surface this as EPIPE immediately; macOS pipes happen to buffer the small payload, so the race only shows on ubuntu CI runners. The parent's write_all raised BrokenPipe, which output_with_stdin returned as Err, which run_command classified as 'failed to start', making the test assertion fail. Fix: (a) make_executable helper sets mode 0o755 via PermissionsExt on each generated hook script, with a #[cfg(unix)] gate and a no-op #[cfg(not(unix))] branch. (b) output_with_stdin now matches the write_all result and swallows BrokenPipe specifically (the child still ran; wait_with_output still captures stdout/stderr/exit code), while propagating all other write errors. (c) New regression guard generated_hook_scripts_are_executable under #[cfg(unix)] asserts each generated .sh file has at least one execute bit set. Surgical scope per gaebal-gajae's direction: chmod + pipe tolerance + regression guard only. The deeper plugin-test sealing pass for ROADMAP #25 + #27 stays in gaebal-gajae's OMX lane. Verification: - cargo test --release -p plugins → 35 passing, 0 failing - cargo fmt -p plugins → clean - cargo clippy -p plugins -- -D warnings → clean Co-authored-by: gaebal-gajae <gaebal-gajae@layofflabs.com>	2026-04-08 15:48:20 +09:00
YeonGyu-Kim	647ff379a4	docs(roadmap): file dev/rust plugin-validation host-home leak as backlog #27 Filing per gaebal-gajae's status summary at message 1491322807026454579 in #clawcode-building-in-public, with corrected scope after re-running `cargo test -p rusty-claude-cli` against main HEAD (`79da4b8`): the 11 deterministic failures only reproduce on dev/rust, not main, so this is a dev/rust catchup item rather than a main regression. Two-layered root cause documented: 1. dev/rust `parse_args` eagerly validates user plugin hook scripts exist on disk before returning a CliAction 2. dev/rust test harness does not redirect $HOME/XDG_CONFIG_HOME to a fixture (no `env_lock` equivalent — main has 30+ env_lock hits, dev has zero) Together they make dev/rust `cargo test -p rusty-claude-cli` fail on any clean clone whose owner has a half-installed user plugin in ~/.claude/plugins/installed/. main has both the env_lock test isolation AND the parse_args/hook-validation decoupling already; dev/rust is just behind on the merge train. Action block in #27 spells out backporting env_lock + the parse_args decoupling so the next dev/rust release picks this up.	2026-04-08 15:30:04 +09:00
YeonGyu-Kim	79da4b8a63	docs(roadmap): record hooks test flake as P2 backlog item #25 Linux CI keeps tripping over `plugins::hooks::tests::collects_and_runs_hooks_from_enabled_plugins` with `Broken pipe (os error 32)` when the hook runner tries to spawn a child shell script that was written by `write_hook_plugin` without the execute bit set. Fails on first attempt, passes on rerun (observed in CI runs 24120271422 and 24120538408). Passes consistently on macOS. Since issues are disabled on the repo, recording as ROADMAP backlog item #25 in the Immediate Backlog P2 cluster next to the related plugin lifecycle flake at #24. Action block spells out the chmod +755 fix in `write_hook_plugin` plus the regression guard.	2026-04-08 15:10:13 +09:00
YeonGyu-Kim	7d90283cf9	docs(roadmap): record cascade-masking pinpoint under green-ness contract (#9 ) Concrete follow-up captured from today's dogfood session: A single hung test (oversized-request preflight, 6 minutes per attempt after `be561bf` silently swallowed count_tokens errors) crashed the `cargo test --workspace` job before downstream crates could run, hiding 6 separate pre-existing CLI regressions until `8c6dfe5` + `5851f2d` restored the fast-fail path. Two new acceptance criteria for #9: - per-test timeouts in CI so one hang cannot mask other failures - distinguish `test.hung` from generic test failures in worker reports	2026-04-08 15:03:30 +09:00
YeonGyu-Kim	5851f2dee8	fix(cli): 6 cascading test regressions hidden behind client_integration gate - compact flag: was parsed then discarded (`compact: _`) instead of passed to `run_turn_with_output` — hardcoded `false` meant --compact never took effect - piped stdin vs permission prompter: `read_piped_stdin()` consumed all stdin before `CliPermissionPrompter::decide()` could read interactive approval answers; now only consumes stdin as prompt context when permission mode is `DangerFullAccess` (fully unattended) - session resolver: `resolve_managed_session_path` and `list_managed_sessions` now fall back to the pre-isolation flat `.claw/sessions/` layout so legacy sessions remain accessible - help assertion: match on stable prefix after `/session delete` was added in batch 5 - prompt shorthand: fix copy-paste that changed expected prompt from "help me debug" to "$help overview" - mock parity harness: filter captured requests to `/v1/messages` path only, excluding count_tokens preflight calls added by `be561bf` All 6 failures were pre-existing but masked because `client_integration` always failed first (fixed in `8c6dfe5`). Workspace: 810+ tests passing, 0 failing.	2026-04-08 14:54:10 +09:00
YeonGyu-Kim	8c6dfe57e6	fix(api): restore local preflight guard ahead of count_tokens round-trip CI has been red since `be561bf` ('Use Anthropic count tokens for preflight') because that commit replaced the free-function preflight_message_request (byte-estimate guard) with an instance method that silently returns Ok on any count_tokens failure: let counted_input_tokens = match self.count_tokens(request).await { Ok(count) => count, Err(_) => return Ok(()), // <-- silent bypass }; Two consequences: 1. client_integration::send_message_blocks_oversized_requests_before_the_http_call has been FAILING on every CI run since `be561bf`. The mock server in that test only has one HTTP response queued (a bare '{}' to satisfy the main request), so the count_tokens POST parses into an empty body that fails to deserialize into CountTokensResponse -> Err -> silent bypass -> the oversized 600k-char request proceeds to the mock instead of being rejected with ContextWindowExceeded as the test expects. 2. In production, any third-party Anthropic-compatible gateway that doesn't implement /v1/messages/count_tokens (OpenRouter, Cloudflare AI Gateway, etc.) would silently disable the preflight guard entirely, letting oversized requests hit the upstream only to fail there with a provider- side context-window error. This is exactly the 'opaque failure surface' ROADMAP #22 asked us to avoid. Fix: call the free-function super::preflight_message_request(request)? as the first step in the instance method, before any network round-trip. This guarantees the byte-estimate guard always fires, whether or not the remote count_tokens endpoint is reachable. The count_tokens refinement still runs afterward when available for more precise token counting, but it is now strictly additive — it can only catch more cases, never silently skip the guard. Test results: - cargo test -p api --lib: 89 passed, 0 failed - cargo test --release -p api (all test binaries): 118 passed, 0 failed - cargo test --release -p api --test client_integration \ send_message_blocks_oversized_requests_before_the_http_call: passes - cargo fmt --check: clean This unblocks the Rust CI workflow which has been red on every push since `be561bf` landed.	2026-04-08 14:34:38 +09:00
YeonGyu-Kim	eed57212bb	docs(usage): add DashScope/Qwen section and prefix routing note Document the qwen/ and qwen- prefix routing added in `3ac97e6`. Users in Discord #clawcode-get-help (web3g, Renan Klehm, matthewblott) kept hitting ambient-credential misrouting because the docs only showed the OPENAI_BASE_URL pattern without explaining that model-name prefix wins over env-var presence. Added: - DashScope usage section with qwen/qwen-max and bare qwen-plus examples - DashScope row in provider matrix table - Reasoning model sanitization note (qwen-qwq, qwq-, -thinking) - Explicit statement that model-name prefix wins over ambient creds	2026-04-08 14:11:12 +09:00
YeonGyu-Kim	3ac97e635e	feat(api): add qwen/ prefix routing for Alibaba DashScope provider Users in Discord #clawcode-get-help (web3g) asked for Qwen 3.6 Plus via native Alibaba DashScope API instead of OpenRouter, which has stricter rate limits. This commit adds first-class routing for qwen/ and bare qwen- prefixed model names. Changes: - DEFAULT_DASHSCOPE_BASE_URL constant: /compatible-mode/v1 endpoint - OpenAiCompatConfig::dashscope() factory mirroring openai()/xai() - DASHSCOPE_ENV_VARS + credential_env_vars() wiring - metadata_for_model: qwen/ and qwen- prefix routes to DashScope with auth_env=DASHSCOPE_API_KEY, reuses ProviderKind::OpenAi because DashScope speaks the OpenAI REST shape - is_reasoning_model: detect qwen-qwq, qwq-, and -thinking variants so tuning params (temperature, top_p, etc.) get stripped before payload assembly (same pattern as o1/o3/grok-3-mini) Tests added: - providers::tests::qwen_prefix_routes_to_dashscope_not_anthropic - openai_compat::tests::qwen_reasoning_variants_are_detected 89 api lib tests passing, 0 failing. cargo fmt --check: clean. Closes the user-reported gap: 'use Qwen 3.6 Plus via Alibaba API directly, not OpenRouter' without needing OPENAI_BASE_URL override or unsetting ANTHROPIC_API_KEY.	2026-04-08 14:06:26 +09:00
YeonGyu-Kim	006f7d7ee6	fix(test): add env_lock to plugin lifecycle test — closes ROADMAP #24 build_runtime_runs_plugin_lifecycle_init_and_shutdown was the only test that set/removed ANTHROPIC_API_KEY without holding the env_lock mutex. Under parallel workspace execution, other tests racing on the same env var could wipe the key mid-construction, causing a flaky credential error. Root cause: process-wide env vars are shared mutable state. All other tests that touch ANTHROPIC_API_KEY already use env_lock(). This test was the only holdout. Fix: add let _guard = env_lock(); at the top of the test.	2026-04-08 12:46:04 +09:00
YeonGyu-Kim	82baaf3f22	fix(ci): update integration test MessageRequest initializers for new tuning fields openai_compat_integration.rs and client_integration.rs had MessageRequest constructions without the new tuning param fields (temperature, top_p, frequency_penalty, presence_penalty, stop) added in `c667d47`. Added ..Default::default() to all 4 sites. cargo fmt applied. This was the root cause of CI red on main (E0063 compile error in integration tests, not caught by --lib tests).	2026-04-08 11:43:51 +09:00
YeonGyu-Kim	c7b3296ef6	style: cargo fmt — fix CI formatting failures Pre-existing formatting issues in anthropic.rs surfaced by CI cargo fmt check. No functional changes.	2026-04-08 11:21:13 +09:00
YeonGyu-Kim	000aed4188	fix(commands): fix brittle /session help assertion after delete subcommand addition renders_help_from_shared_specs hardcoded the exact /session usage string, which broke when /session delete was added in batch 5. Relaxed to check for /session presence instead of exact subcommand list. Pre-existing test brittleness (not caused by recent commits). 687 workspace lib tests passing, 0 failing.	2026-04-08 09:33:51 +09:00
YeonGyu-Kim	523ce7474a	fix(api): sanitize Anthropic body — strip frequency/presence_penalty, convert stop→stop_sequences MessageRequest now carries OpenAI-compatible tuning params (`c667d47`), but the Anthropic API does not support frequency_penalty or presence_penalty, and uses 'stop_sequences' instead of 'stop'. Without this fix, setting these params with a Claude model would produce 400 errors. Changes to strip_unsupported_beta_body_fields: - Remove frequency_penalty and presence_penalty from Anthropic request body - Convert stop → stop_sequences (only when non-empty) - temperature and top_p are preserved (Anthropic supports both) Tests added: - strip_removes_openai_only_fields_and_converts_stop - strip_does_not_add_empty_stop_sequences 87 api lib tests passing, 0 failing. cargo check --workspace: clean.	2026-04-08 09:05:10 +09:00
YeonGyu-Kim	b513d6e462	fix(api): sanitize tuning params for reasoning models (o1/o3/grok-3-mini) Reasoning models reject temperature, top_p, frequency_penalty, and presence_penalty with 400 errors. Instead of letting these flow through and returning cryptic provider errors, strip them silently at the request-builder boundary. is_reasoning_model() classifies: o1, o3, o4*, grok-3-mini. stop sequences are preserved (safe for all providers). Tests added: - reasoning_model_strips_tuning_params: o1-mini strips all 4 params, keeps stop - grok_3_mini_is_reasoning_model: classification coverage for grok-3-mini, o1, o3-mini, and negative cases (gpt-4o, grok-3, claude) 85 api lib tests passing, 0 failing.	2026-04-08 07:32:47 +09:00
YeonGyu-Kim	c667d47c70	feat(api): add tuning params (temperature, top_p, penalties, stop) to MessageRequest MessageRequest was missing standard OpenAI-compatible generation tuning parameters. Callers had no way to control temperature, top_p, frequency_penalty, presence_penalty, or stop sequences. Changes: - Added 5 optional fields to MessageRequest (all Option, None by default) - Wired into build_chat_completion_request: only included in payload when set - All existing construction sites updated with ..Default::default() - MessageRequest now derives Default for ergonomic partial construction Tests added: - tuning_params_included_in_payload_when_set: all 5 params flow into JSON - tuning_params_omitted_from_payload_when_none: absent params stay absent 83 api lib tests passing, 0 failing. cargo check --workspace: 0 warnings.	2026-04-08 07:07:33 +09:00
YeonGyu-Kim	7546c1903d	docs(roadmap): document provider routing fix and auth-sniffer fragility lesson Filed: openai/ prefix model misrouting (fixed in `0530c50`). Documents root cause, fix, and the architectural lesson: - metadata_for_model is the canonical extension point for new providers - auth-sniffer fallback order must never override explicit model-name prefix - regression test locked in to guard this invariant	2026-04-08 05:35:12 +09:00
YeonGyu-Kim	0530c509a3	fix(api): route openai/ and gpt- model prefixes to OpenAi provider metadata_for_model returned None for unknown models like openai/gpt-4.1-mini, causing detect_provider_kind to fall through to auth-sniffer order. If ANTHROPIC_API_KEY was set, the model was silently misrouted to Anthropic and the user got a confusing 'missing Anthropic credentials' error. Fix: add explicit prefix checks for 'openai/' and 'gpt-' in metadata_for_model so the model name wins over env-var presence. Regression test added: openai_namespaced_model_routes_to_openai_not_anthropic - 'openai/gpt-4.1-mini' routes to OpenAi - 'gpt-4o' routes to OpenAi Reported and reproduced by gaebal-gajae against current main. 81 api lib tests passing, 0 failing.	2026-04-08 05:33:47 +09:00
YeonGyu-Kim	eff0765167	test(tools): fill WorkerGet and error-path coverage gaps WorkerGet had zero test coverage. WorkerAwaitReady and WorkerSendPrompt had only one happy-path test each with no error paths. Added 4 tests: - worker_get_returns_worker_state: WorkerGet fetches correct worker_id/status/cwd - worker_get_on_unknown_id_returns_error: unknown id -> 'worker not found' - worker_await_ready_on_spawning_worker_returns_not_ready: ready=false on spawning worker - worker_send_prompt_on_non_ready_worker_returns_error: sending prompt before ready fails 94 tool tests passing, 0 failing.	2026-04-08 05:03:34 +09:00
YeonGyu-Kim	aee5263aef	test(tools): prove recovery loop against .claw/worker-state.json directly recovery_loop_state_file_reflects_transitions reads the actual state file after each transition to verify the canonical observability surface reflects the full stall->resolve->ready progression: spawning (state file exists, seconds_since_update present) -> trust_required (is_ready=false, trust_gate_cleared=false in file) -> spawning (trust_gate_cleared=true after WorkerResolveTrust) -> ready_for_prompt (is_ready=true after ready screen observe) This is the end-to-end proof gaebal-gajae called for: clawhip polling .claw/worker-state.json will see truthful state at every step of the recovery loop, including the seconds_since_update staleness signal. 90 tool tests passing, 0 failing.	2026-04-08 04:38:38 +09:00
YeonGyu-Kim	9461522af5	feat(tools): expose WorkerObserveCompletion tool; add provider-degraded classification tests observe_completion() on WorkerRegistry classifies finish_reason into Finished vs Failed (finish='unknown' + 0 tokens = provider degraded). This logic existed in the runtime but had no tool wrapper — clawhip could not call it. Added WorkerObserveCompletion as a first-class tool. Tool schema: { worker_id, finish_reason: string, tokens_output: integer } Handler: run_worker_observe_completion -> global_worker_registry().observe_completion() Tests added: - worker_observe_completion_success_finish_sets_finished_status finish=end_turn + tokens=512 -> status=finished - worker_observe_completion_degraded_provider_sets_failed_status finish=unknown + tokens=0 -> status=failed, last_error populated 89 tool tests passing, 0 failing.	2026-04-08 04:35:05 +09:00
YeonGyu-Kim	c08f060ca1	test(tools): end-to-end stall-detect and recovery loop coverage Proves the clawhip restart/recover flow that gaebal-gajae flagged: 1. stall_detect_and_resolve_trust_end_to_end - Worker created without trusted_roots -> trust_auto_resolve=false - WorkerObserve with trust-prompt text -> status=trust_required, gate cleared=false - WorkerResolveTrust -> status=spawning, trust_gate_cleared=true - WorkerObserve with ready text -> status=ready_for_prompt Full resolve path verified end-to-end. 2. stall_detect_and_restart_recovery_end_to_end - Worker stalls at trust_required - WorkerRestart resets to spawning, trust_gate_cleared=false Documents the restart-then-re-acquire-trust flow. Note: seconds_since_update is in .claw/worker-state.json (state file), not in the Worker tool output struct. Staleness detection via state file is covered by emit_state_file_writes_worker_status_on_transition in worker_boot.rs tests. 87 tool tests passing, 0 failing.	2026-04-08 04:09:55 +09:00
YeonGyu-Kim	cae11413dd	fix(dead-code): remove stale constants + dead function; add workspace_sessions_dir tests Three dead-code warnings eliminated from cargo check: 1. KNOWN_TOP_LEVEL_KEYS / DEPRECATED_TOP_LEVEL_KEYS in config.rs - Superseded by config_validate::TOP_LEVEL_FIELDS and DEPRECATED_FIELDS - Were out of date (missing aliases, providerFallbacks, trustedRoots) - Removed 2. read_git_recent_commits in prompt.rs - Private function, never called anywhere in the codebase - Removed 3. workspace_sessions_dir in session.rs - Public API scaffolded for session isolation (#41) - Genuinely useful for external consumers (clawhip enumerating sessions) - Added 2 tests: deterministic path for same CWD, different path for different CWDs - Annotated with #[allow(dead_code)] since it is external-facing API cargo check --workspace: 0 warnings remaining 430 runtime tests passing, 0 failing	2026-04-08 04:04:54 +09:00
YeonGyu-Kim	60410b6c92	docs(roadmap): settle observability transport — CLI/file is canonical, HTTP deferred Closes the ambiguity gaebal-gajae flagged: downstream tooling was left guessing which integration surface to build against. Decision: claw state + .claw/worker-state.json is the blessed contract. HTTP endpoint not scheduled. Rationale documented: - plugin scope constraint (can't add routes to opencode serve) - file polling has lower latency and fewer failure modes than HTTP - HTTP would require upstreaming to sst/opencode or a fragile sidecar Clawhip integration contract documented: - poll .claw/worker-state.json after WorkerCreate - seconds_since_update > 60 in trust_required = stall signal - WorkerResolveTrust to unblock, WorkerRestart to reset	2026-04-08 03:34:31 +09:00
YeonGyu-Kim	aa37dc6936	test(tools): add coverage for WorkerRestart and WorkerTerminate tools WorkerRestart and WorkerTerminate had zero test coverage despite being public tools in the tool spec. Also confirms one design decision worth noting: restart resets trust_gate_cleared=false, so an allowlisted worker that gets restarted must re-acquire trust via the normal observe flow (by design — trust is per-session, not per-CWD). Tests added: - worker_terminate_sets_finished_status - worker_restart_resets_to_spawning (verifies status=spawning, prompt_in_flight=false, trust_gate_cleared=false) - worker_terminate_on_unknown_id_returns_error - worker_restart_on_unknown_id_returns_error 85 tool tests passing, 0 failing.	2026-04-08 03:33:05 +09:00
YeonGyu-Kim	6ddfa78b7c	feat(tools): wire config.trusted_roots into WorkerCreate tool Previously WorkerCreate passed trusted_roots directly to spawn_worker with no config-level default. Any batch script omitting the field stalled all workers at TrustRequired with no recovery path. Now run_worker_create loads RuntimeConfig from the worker CWD before spawning and merges config.trusted_roots() with per-call overrides. Per-call overrides still take effect; config provides the default. Add test: worker_create_merges_config_trusted_roots_without_per_call_override - writes .claw/settings.json with trustedRoots=[<os-temp-dir>] in a temp worktree - calls WorkerCreate with no trusted_roots field - asserts trust_auto_resolve=true (config roots matched the CWD) 81 tool tests passing, 0 failing.	2026-04-08 03:08:13 +09:00
YeonGyu-Kim	bcdc52d72c	feat(config): add trustedRoots to RuntimeConfig Closes the startup-friction gap filed in ROADMAP (`dd97c49`). WorkerCreate required trusted_roots on every call with no config-level default. Any batch script that omitted the field stalled all workers at TrustRequired with no auto-recovery path. Changes: - RuntimeFeatureConfig: add trusted_roots: Vec<String> field - ConfigLoader: wire parse_optional_trusted_roots() for 'trustedRoots' key - RuntimeConfig / RuntimeFeatureConfig: expose trusted_roots() accessor - config_validate: add trustedRoots to TOP_LEVEL_FIELDS schema (StringArray) - Tests: parses_trusted_roots_from_settings + trusted_roots_default_is_empty_when_unset Callers can now set trusted_roots in .claw/settings.json: { "trustedRoots": ["/tmp/worktrees"] } WorkerRegistry::spawn_worker() callers should merge config.trusted_roots() with any per-call overrides (wiring left for follow-up).	2026-04-08 02:35:19 +09:00
YeonGyu-Kim	dd97c49e6b	docs(roadmap): file startup-friction gap — no default trusted_roots in settings WorkerCreate requires trusted_roots per-call; no config-level default. Any batch that forgets the field stalls all workers at trust_required. Root cause of several 'batch lanes not advancing' incidents. Recommended fix: wire RuntimeConfig::trusted_roots() as default into WorkerRegistry::spawn_worker(), with per-call overrides. Update config_validate schema to include the new field.	2026-04-08 02:02:48 +09:00
YeonGyu-Kim	5dfb1d7c2b	fix(config_validate): add missing aliases/providerFallbacks to schema; fix deprecated-key bypass Two real schema gaps found via dogfood (cargo test -p runtime): 1. aliases and providerFallbacks not in TOP_LEVEL_FIELDS - Both are valid config keys parsed by config.rs - Validator was rejecting them as unknown keys - 2 tests failing: parses_user_defined_model_aliases, parses_provider_fallbacks_chain 2. Deprecated keys were being flagged as unknown before the deprecated check ran (unknown-key check runs first in validate_object_keys) - Added early-exit for deprecated keys in unknown-key loop - Keeps deprecated→warning behavior for permissionMode/enabledPlugins which still appear in valid legacy configs 3. Config integration tests had assertions on format strings that never matched the actual validator output (path:3: vs path: ... (line N)) - Updated assertions to check for path + line + field name as independent substrings instead of a format that was never produced 426 tests passing, 0 failing.	2026-04-08 01:45:08 +09:00
YeonGyu-Kim	fcb5d0c16a	fix(worker_boot): add seconds_since_update to state snapshot Clawhip needs to distinguish a stalled trust_required worker from one that just transitioned. Without a pre-computed staleness field it has to compute epoch delta itself from updated_at. seconds_since_update = now - updated_at at snapshot write time. Clawhip threshold: > 60s in trust_required = stalled; act.	2026-04-08 01:03:00 +09:00

1 2 3 4 5 ...

675 Commits