648 Commits

Author SHA1 Message Date
YeonGyu-Kim
eed57212bb docs(usage): add DashScope/Qwen section and prefix routing note
Document the qwen/ and qwen- prefix routing added in 3ac97e6. Users
in Discord #clawcode-get-help (web3g, Renan Klehm, matthewblott) kept
hitting ambient-credential misrouting because the docs only showed
the OPENAI_BASE_URL pattern without explaining that model-name prefix
wins over env-var presence.

Added:
- DashScope usage section with qwen/qwen-max and bare qwen-plus examples
- DashScope row in provider matrix table
- Reasoning model sanitization note (qwen-qwq, qwq-*, *-thinking)
- Explicit statement that model-name prefix wins over ambient creds
2026-04-08 14:11:12 +09:00
YeonGyu-Kim
3ac97e635e feat(api): add qwen/ prefix routing for Alibaba DashScope provider
Users in Discord #clawcode-get-help (web3g) asked for Qwen 3.6 Plus via
native Alibaba DashScope API instead of OpenRouter, which has stricter
rate limits. This commit adds first-class routing for qwen/ and bare
qwen- prefixed model names.

Changes:
- DEFAULT_DASHSCOPE_BASE_URL constant: /compatible-mode/v1 endpoint
- OpenAiCompatConfig::dashscope() factory mirroring openai()/xai()
- DASHSCOPE_ENV_VARS + credential_env_vars() wiring
- metadata_for_model: qwen/ and qwen- prefix routes to DashScope with
  auth_env=DASHSCOPE_API_KEY, reuses ProviderKind::OpenAi because
  DashScope speaks the OpenAI REST shape
- is_reasoning_model: detect qwen-qwq, qwq-*, and *-thinking variants
  so tuning params (temperature, top_p, etc.) get stripped before
  payload assembly (same pattern as o1/o3/grok-3-mini)

Tests added:
- providers::tests::qwen_prefix_routes_to_dashscope_not_anthropic
- openai_compat::tests::qwen_reasoning_variants_are_detected

89 api lib tests passing, 0 failing. cargo fmt --check: clean.

Closes the user-reported gap: 'use Qwen 3.6 Plus via Alibaba API
directly, not OpenRouter' without needing OPENAI_BASE_URL override
or unsetting ANTHROPIC_API_KEY.
2026-04-08 14:06:26 +09:00
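
A minimal sketch of the prefix routing described in 3ac97e6, assuming a simplified metadata type. ModelMetadata and its fields are hypothetical stand-ins, and the claude branch is included only to give the function a second outcome; the real metadata_for_model may look different.

```rust
#[derive(Debug, PartialEq)]
enum ProviderKind {
    Anthropic,
    OpenAi,
}

// Hypothetical, trimmed-down metadata record (not the project's actual type).
#[derive(Debug, PartialEq)]
struct ModelMetadata {
    provider: ProviderKind,
    auth_env: &'static str,
}

/// Sketch of prefix-based routing: the model name alone decides the provider.
fn metadata_for_model(model: &str) -> Option<ModelMetadata> {
    if model.starts_with("qwen/") || model.starts_with("qwen-") {
        // DashScope speaks the OpenAI REST shape, so the OpenAi kind is reused
        // and only the credential variable differs.
        return Some(ModelMetadata {
            provider: ProviderKind::OpenAi,
            auth_env: "DASHSCOPE_API_KEY",
        });
    }
    // Assumption for illustration: claude-prefixed names map to Anthropic.
    if model.starts_with("claude") {
        return Some(ModelMetadata {
            provider: ProviderKind::Anthropic,
            auth_env: "ANTHROPIC_API_KEY",
        });
    }
    // Unknown names fall through to whatever fallback the caller uses.
    None
}
```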
YeonGyu-Kim
006f7d7ee6 fix(test): add env_lock to plugin lifecycle test — closes ROADMAP #24
build_runtime_runs_plugin_lifecycle_init_and_shutdown was the only test
that set/removed ANTHROPIC_API_KEY without holding the env_lock mutex.
Under parallel workspace execution, other tests racing on the same env
var could wipe the key mid-construction, causing a flaky credential error.

Root cause: process-wide env vars are shared mutable state. All other
tests that touch ANTHROPIC_API_KEY already use env_lock(). This test
was the only holdout.

Fix: add let _guard = env_lock(); at the top of the test.
2026-04-08 12:46:04 +09:00
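
The env_lock pattern from 006f7d7 can be sketched as a process-wide mutex that every env-touching test holds. This is an assumed shape, not the project's helper; with_env_var and the variable name used with it are hypothetical.

```rust
use std::env;
use std::sync::{Mutex, MutexGuard, OnceLock};

/// Hypothetical equivalent of `env_lock()`: one global mutex for all tests
/// that read or write process environment variables.
fn env_lock() -> MutexGuard<'static, ()> {
    static LOCK: OnceLock<Mutex<()>> = OnceLock::new();
    LOCK.get_or_init(|| Mutex::new(()))
        .lock()
        // A panicking test poisons the mutex; recover so later tests still run.
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Runs `body` with `key` set, holding the lock across set, run, and remove.
/// If `body` panics the variable is left set; a real helper might use a drop guard.
fn with_env_var<T>(key: &str, value: &str, body: impl FnOnce() -> T) -> T {
    let _guard = env_lock();
    // `set_var` and `remove_var` are `unsafe fn` in the 2024 edition because the
    // environment is shared mutable state; the blocks are harmless on older editions.
    unsafe { env::set_var(key, value) };
    let out = body();
    unsafe { env::remove_var(key) };
    out
}
```

Holding the guard for the whole sequence is the point: a test that only locks around set_var can still have its key removed by another test before it finishes.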
YeonGyu-Kim
82baaf3f22 fix(ci): update integration test MessageRequest initializers for new tuning fields
openai_compat_integration.rs and client_integration.rs had MessageRequest
constructions without the new tuning param fields (temperature, top_p,
frequency_penalty, presence_penalty, stop) added in c667d47.

Added ..Default::default() to all 4 sites. cargo fmt applied.

This was the root cause of CI red on main (E0063 compile error in
integration tests, not caught by --lib tests).
2026-04-08 11:43:51 +09:00
YeonGyu-Kim
c7b3296ef6 style: cargo fmt — fix CI formatting failures
Pre-existing formatting issues in anthropic.rs surfaced by CI cargo fmt check.
No functional changes.
2026-04-08 11:21:13 +09:00
YeonGyu-Kim
000aed4188 fix(commands): fix brittle /session help assertion after delete subcommand addition
renders_help_from_shared_specs hardcoded the exact /session usage string,
which broke when /session delete was added in batch 5. Relaxed to check
for /session presence instead of exact subcommand list.

Pre-existing test brittleness (not caused by recent commits).

687 workspace lib tests passing, 0 failing.
2026-04-08 09:33:51 +09:00
YeonGyu-Kim
523ce7474a fix(api): sanitize Anthropic body — strip frequency/presence_penalty, convert stop→stop_sequences
MessageRequest now carries OpenAI-compatible tuning params (c667d47), but
the Anthropic API does not support frequency_penalty or presence_penalty,
and uses 'stop_sequences' instead of 'stop'. Without this fix, setting
these params with a Claude model would produce 400 errors.

Changes to strip_unsupported_beta_body_fields:
- Remove frequency_penalty and presence_penalty from Anthropic request body
- Convert stop → stop_sequences (only when non-empty)
- temperature and top_p are preserved (Anthropic supports both)

Tests added:
- strip_removes_openai_only_fields_and_converts_stop
- strip_does_not_add_empty_stop_sequences

87 api lib tests passing, 0 failing.
cargo check --workspace: clean.
2026-04-08 09:05:10 +09:00
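
The body sanitization in 523ce74 can be illustrated on a toy request body. The Value enum and Body alias below are hypothetical stand-ins for a JSON value and object; the real strip_unsupported_beta_body_fields presumably works on actual JSON.

```rust
use std::collections::BTreeMap;

// Hypothetical miniature JSON value, just enough for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    List(Vec<String>),
}

type Body = BTreeMap<String, Value>;

/// Sketch: drop OpenAI-only penalties and rename `stop` to `stop_sequences`.
fn sanitize_for_anthropic(body: &mut Body) {
    body.remove("frequency_penalty");
    body.remove("presence_penalty");
    // `stop` is always removed; it is re-added under the Anthropic name only
    // when it is a non-empty list, so no empty `stop_sequences` is emitted.
    if let Some(stop) = body.remove("stop") {
        let non_empty = matches!(&stop, Value::List(items) if !items.is_empty());
        if non_empty {
            body.insert("stop_sequences".to_string(), stop);
        }
    }
    // temperature and top_p are left untouched.
}
```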
YeonGyu-Kim
b513d6e462 fix(api): sanitize tuning params for reasoning models (o1/o3/grok-3-mini)
Reasoning models reject temperature, top_p, frequency_penalty, and
presence_penalty with 400 errors. Instead of letting these flow through
and returning cryptic provider errors, strip them silently at the
request-builder boundary.

is_reasoning_model() classifies: o1*, o3*, o4*, grok-3-mini.
stop sequences are preserved (safe for all providers).

Tests added:
- reasoning_model_strips_tuning_params: o1-mini strips all 4 params, keeps stop
- grok_3_mini_is_reasoning_model: classification coverage for grok-3-mini, o1,
  o3-mini, and negative cases (gpt-4o, grok-3, claude)

85 api lib tests passing, 0 failing.
2026-04-08 07:32:47 +09:00
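
A rough sketch of the classification and stripping described in b513d6e. The Tuning struct and sanitize_tuning are hypothetical names, and treating grok-3-mini as an exact match (rather than a prefix) is an assumption.

```rust
// Hypothetical container for the five tuning params.
#[derive(Debug, Default, Clone, PartialEq)]
struct Tuning {
    temperature: Option<f64>,
    top_p: Option<f64>,
    frequency_penalty: Option<f64>,
    presence_penalty: Option<f64>,
    stop: Option<Vec<String>>,
}

/// Sketch of the classifier: o1*, o3*, o4*, and grok-3-mini.
fn is_reasoning_model(model: &str) -> bool {
    model.starts_with("o1")
        || model.starts_with("o3")
        || model.starts_with("o4")
        || model == "grok-3-mini"
}

/// For reasoning models, keep only `stop`; other models pass through unchanged.
fn sanitize_tuning(model: &str, tuning: Tuning) -> Tuning {
    if !is_reasoning_model(model) {
        return tuning;
    }
    Tuning {
        stop: tuning.stop,
        ..Tuning::default()
    }
}
```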
YeonGyu-Kim
c667d47c70 feat(api): add tuning params (temperature, top_p, penalties, stop) to MessageRequest
MessageRequest was missing standard OpenAI-compatible generation tuning
parameters. Callers had no way to control temperature, top_p,
frequency_penalty, presence_penalty, or stop sequences.

Changes:
- Added 5 optional fields to MessageRequest (all Option, None by default)
- Wired into build_chat_completion_request: only included in payload when set
- All existing construction sites updated with ..Default::default()
- MessageRequest now derives Default for ergonomic partial construction

Tests added:
- tuning_params_included_in_payload_when_set: all 5 params flow into JSON
- tuning_params_omitted_from_payload_when_none: absent params stay absent

83 api lib tests passing, 0 failing.
cargo check --workspace: 0 warnings.
2026-04-08 07:07:33 +09:00
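
To illustrate the "only included in payload when set" behavior from c667d47, here is a simplified stand-in for MessageRequest. The real struct has more fields and builds JSON; payload_fields is a hypothetical helper that returns key/value pairs instead.

```rust
// Simplified stand-in: only the model name and the five tuning params.
#[derive(Debug, Default)]
struct MessageRequest {
    model: String,
    temperature: Option<f64>,
    top_p: Option<f64>,
    frequency_penalty: Option<f64>,
    presence_penalty: Option<f64>,
    stop: Option<Vec<String>>,
}

/// Hypothetical payload builder: a key appears only when its field is Some.
fn payload_fields(req: &MessageRequest) -> Vec<(&'static str, String)> {
    let mut fields = vec![("model", req.model.clone())];
    let numeric = [
        ("temperature", req.temperature),
        ("top_p", req.top_p),
        ("frequency_penalty", req.frequency_penalty),
        ("presence_penalty", req.presence_penalty),
    ];
    for (name, value) in numeric {
        if let Some(v) = value {
            fields.push((name, v.to_string()));
        }
    }
    if let Some(stop) = &req.stop {
        // Joined with commas purely for display in this sketch.
        fields.push(("stop", stop.join(",")));
    }
    fields
}
```

Deriving Default is what lets call sites name only the fields they care about and fill the rest with ..Default::default(), which is also why adding fields later does not break them.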
YeonGyu-Kim
7546c1903d docs(roadmap): document provider routing fix and auth-sniffer fragility lesson
Filed: openai/ prefix model misrouting (fixed in 0530c50).
Documents root cause, fix, and the architectural lesson:
  - metadata_for_model is the canonical extension point for new providers
  - auth-sniffer fallback order must never override explicit model-name prefix
  - regression test locked in to guard this invariant
2026-04-08 05:35:12 +09:00
YeonGyu-Kim
0530c509a3 fix(api): route openai/ and gpt- model prefixes to OpenAi provider
metadata_for_model returned None for unknown models like openai/gpt-4.1-mini,
causing detect_provider_kind to fall through to auth-sniffer order. If
ANTHROPIC_API_KEY was set, the model was silently misrouted to Anthropic
and the user got a confusing 'missing Anthropic credentials' error.

Fix: add explicit prefix checks for 'openai/' and 'gpt-' in
metadata_for_model so the model name wins over env-var presence.

Regression test added: openai_namespaced_model_routes_to_openai_not_anthropic
- 'openai/gpt-4.1-mini' routes to OpenAi
- 'gpt-4o' routes to OpenAi

Reported and reproduced by gaebal-gajae against current main.
81 api lib tests passing, 0 failing.
2026-04-08 05:33:47 +09:00
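
The precedence rule behind 0530c50 (model-name prefix first, ambient credentials only as a fallback) might be sketched like this. AmbientCreds and kind_from_model_name are hypothetical, and the claude branch is an assumption added for contrast.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProviderKind {
    Anthropic,
    OpenAi,
}

// Hypothetical snapshot of which credential env vars are present.
struct AmbientCreds {
    anthropic_key: bool,
    openai_key: bool,
}

/// Explicit prefixes decide the provider regardless of environment.
fn kind_from_model_name(model: &str) -> Option<ProviderKind> {
    if model.starts_with("openai/") || model.starts_with("gpt-") {
        Some(ProviderKind::OpenAi)
    } else if model.starts_with("claude") {
        Some(ProviderKind::Anthropic)
    } else {
        None
    }
}

/// Only when the name is unrecognized does the auth sniffer get a say.
fn detect_provider_kind(model: &str, creds: &AmbientCreds) -> Option<ProviderKind> {
    kind_from_model_name(model).or_else(|| {
        if creds.anthropic_key {
            Some(ProviderKind::Anthropic)
        } else if creds.openai_key {
            Some(ProviderKind::OpenAi)
        } else {
            None
        }
    })
}
```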
YeonGyu-Kim
eff0765167 test(tools): fill WorkerGet and error-path coverage gaps
WorkerGet had zero test coverage. WorkerAwaitReady and WorkerSendPrompt
had only one happy-path test each with no error paths.

Added 4 tests:
- worker_get_returns_worker_state: WorkerGet fetches correct worker_id/status/cwd
- worker_get_on_unknown_id_returns_error: unknown id -> 'worker not found'
- worker_await_ready_on_spawning_worker_returns_not_ready: ready=false on spawning worker
- worker_send_prompt_on_non_ready_worker_returns_error: sending prompt before ready fails

94 tool tests passing, 0 failing.
2026-04-08 05:03:34 +09:00
YeonGyu-Kim
aee5263aef test(tools): prove recovery loop against .claw/worker-state.json directly
recovery_loop_state_file_reflects_transitions reads the actual state
file after each transition to verify the canonical observability surface
reflects the full stall->resolve->ready progression:

  spawning (state file exists, seconds_since_update present)
  -> trust_required (is_ready=false, trust_gate_cleared=false in file)
  -> spawning (trust_gate_cleared=true after WorkerResolveTrust)
  -> ready_for_prompt (is_ready=true after ready screen observe)

This is the end-to-end proof gaebal-gajae called for: clawhip polling
.claw/worker-state.json will see truthful state at every step of the
recovery loop, including the seconds_since_update staleness signal.

90 tool tests passing, 0 failing.
2026-04-08 04:38:38 +09:00
YeonGyu-Kim
9461522af5 feat(tools): expose WorkerObserveCompletion tool; add provider-degraded classification tests
observe_completion() on WorkerRegistry classifies finish_reason into
Finished vs Failed (finish='unknown' + 0 tokens = provider degraded).
This logic existed in the runtime but had no tool wrapper — clawhip
could not call it. Added WorkerObserveCompletion as a first-class tool.

Tool schema:
  { worker_id, finish_reason: string, tokens_output: integer }

Handler: run_worker_observe_completion -> global_worker_registry().observe_completion()

Tests added:
- worker_observe_completion_success_finish_sets_finished_status
  finish=end_turn + tokens=512 -> status=finished
- worker_observe_completion_degraded_provider_sets_failed_status
  finish=unknown + tokens=0 -> status=failed, last_error populated

89 tool tests passing, 0 failing.
2026-04-08 04:35:05 +09:00
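
The finish_reason classification from 9461522 reduces to one check. The sketch below uses a hypothetical Completion enum and error text; the real observe_completion lives on WorkerRegistry and updates worker state.

```rust
// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq)]
enum Completion {
    Finished,
    Failed { last_error: String },
}

/// finish_reason "unknown" with zero output tokens is treated as a degraded
/// provider; every other combination counts as a normal finish.
fn classify_completion(finish_reason: &str, tokens_output: u64) -> Completion {
    if finish_reason == "unknown" && tokens_output == 0 {
        Completion::Failed {
            last_error: "provider degraded: unknown finish with 0 output tokens".to_string(),
        }
    } else {
        Completion::Finished
    }
}
```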
YeonGyu-Kim
c08f060ca1 test(tools): end-to-end stall-detect and recovery loop coverage
Proves the clawhip restart/recover flow that gaebal-gajae flagged:

1. stall_detect_and_resolve_trust_end_to_end
   - Worker created without trusted_roots -> trust_auto_resolve=false
   - WorkerObserve with trust-prompt text -> status=trust_required, gate cleared=false
   - WorkerResolveTrust -> status=spawning, trust_gate_cleared=true
   - WorkerObserve with ready text -> status=ready_for_prompt
   Full resolve path verified end-to-end.

2. stall_detect_and_restart_recovery_end_to_end
   - Worker stalls at trust_required
   - WorkerRestart resets to spawning, trust_gate_cleared=false
   Documents the restart-then-re-acquire-trust flow.

Note: seconds_since_update is in .claw/worker-state.json (state file),
not in the Worker tool output struct. Staleness detection via state file
is covered by emit_state_file_writes_worker_status_on_transition in
worker_boot.rs tests.

87 tool tests passing, 0 failing.
2026-04-08 04:09:55 +09:00
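
The two recovery paths exercised in c08f060 can be modeled as a small state machine. This is an illustrative reduction, assuming only the three statuses and the single flag named in the commit; the method names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Spawning,
    TrustRequired,
    ReadyForPrompt,
}

#[derive(Debug)]
struct Worker {
    status: Status,
    trust_gate_cleared: bool,
}

impl Worker {
    fn new() -> Self {
        Worker { status: Status::Spawning, trust_gate_cleared: false }
    }

    /// WorkerObserve saw trust-prompt text.
    fn observe_trust_prompt(&mut self) {
        self.status = Status::TrustRequired;
    }

    /// WorkerResolveTrust: back to spawning with the gate cleared.
    fn resolve_trust(&mut self) {
        if self.status == Status::TrustRequired {
            self.status = Status::Spawning;
            self.trust_gate_cleared = true;
        }
    }

    /// WorkerObserve saw the ready screen.
    fn observe_ready(&mut self) {
        if self.status == Status::Spawning {
            self.status = Status::ReadyForPrompt;
        }
    }

    /// WorkerRestart: trust must be re-acquired afterwards.
    fn restart(&mut self) {
        self.status = Status::Spawning;
        self.trust_gate_cleared = false;
    }
}
```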
YeonGyu-Kim
cae11413dd fix(dead-code): remove stale constants + dead function; add workspace_sessions_dir tests
Three dead-code warnings eliminated from cargo check:

1. KNOWN_TOP_LEVEL_KEYS / DEPRECATED_TOP_LEVEL_KEYS in config.rs
   - Superseded by config_validate::TOP_LEVEL_FIELDS and DEPRECATED_FIELDS
   - Were out of date (missing aliases, providerFallbacks, trustedRoots)
   - Removed

2. read_git_recent_commits in prompt.rs
   - Private function, never called anywhere in the codebase
   - Removed

3. workspace_sessions_dir in session.rs
   - Public API scaffolded for session isolation (#41)
   - Genuinely useful for external consumers (clawhip enumerating sessions)
   - Added 2 tests: deterministic path for same CWD, different path for different CWDs
   - Annotated with #[allow(dead_code)] since it is external-facing API

cargo check --workspace: 0 warnings remaining
430 runtime tests passing, 0 failing
2026-04-08 04:04:54 +09:00
YeonGyu-Kim
60410b6c92 docs(roadmap): settle observability transport — CLI/file is canonical, HTTP deferred
Closes the ambiguity gaebal-gajae flagged: downstream tooling was left
guessing which integration surface to build against.

Decision: claw state + .claw/worker-state.json is the blessed contract.
HTTP endpoint not scheduled. Rationale documented:
- plugin scope constraint (can't add routes to opencode serve)
- file polling has lower latency and fewer failure modes than HTTP
- HTTP would require upstreaming to sst/opencode or a fragile sidecar

Clawhip integration contract documented:
- poll .claw/worker-state.json after WorkerCreate
- seconds_since_update > 60 in trust_required = stall signal
- WorkerResolveTrust to unblock, WorkerRestart to reset
2026-04-08 03:34:31 +09:00
YeonGyu-Kim
aa37dc6936 test(tools): add coverage for WorkerRestart and WorkerTerminate tools
WorkerRestart and WorkerTerminate had zero test coverage despite being
public tools in the tool spec. Also confirms one design decision worth
noting: restart resets trust_gate_cleared=false, so an allowlisted
worker that gets restarted must re-acquire trust via the normal observe
flow (by design — trust is per-session, not per-CWD).

Tests added:
- worker_terminate_sets_finished_status
- worker_restart_resets_to_spawning (verifies status=spawning,
  prompt_in_flight=false, trust_gate_cleared=false)
- worker_terminate_on_unknown_id_returns_error
- worker_restart_on_unknown_id_returns_error

85 tool tests passing, 0 failing.
2026-04-08 03:33:05 +09:00
YeonGyu-Kim
6ddfa78b7c feat(tools): wire config.trusted_roots into WorkerCreate tool
Previously WorkerCreate passed trusted_roots directly to spawn_worker
with no config-level default. Any batch script omitting the field
stalled all workers at TrustRequired with no recovery path.

Now run_worker_create loads RuntimeConfig from the worker CWD before
spawning and merges config.trusted_roots() with per-call overrides.
Per-call overrides still take effect; config provides the default.

Add test: worker_create_merges_config_trusted_roots_without_per_call_override
- writes .claw/settings.json with trustedRoots=[<os-temp-dir>] in a temp worktree
- calls WorkerCreate with no trusted_roots field
- asserts trust_auto_resolve=true (config roots matched the CWD)

81 tool tests passing, 0 failing.
2026-04-08 03:08:13 +09:00
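
One way the merge in 6ddfa78 could work is a simple union of config roots and per-call roots, followed by a prefix check on the worker CWD. Both function names and the union semantics are assumptions; the commit does not spell out the exact merge rule.

```rust
use std::path::Path;

/// Hypothetical merge: config roots first, then any per-call roots not already present.
fn merge_trusted_roots(config_roots: &[String], per_call: &[String]) -> Vec<String> {
    let mut merged = config_roots.to_vec();
    for root in per_call {
        if !merged.contains(root) {
            merged.push(root.clone());
        }
    }
    merged
}

/// Trust resolves automatically when the CWD sits under any trusted root.
/// `Path::starts_with` compares whole components, so "/tmp/worktrees-x"
/// does not match the root "/tmp/worktrees".
fn trust_auto_resolve(cwd: &Path, roots: &[String]) -> bool {
    roots.iter().any(|root| cwd.starts_with(root))
}
```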
YeonGyu-Kim
bcdc52d72c feat(config): add trustedRoots to RuntimeConfig
Closes the startup-friction gap filed in ROADMAP (dd97c49).

WorkerCreate required trusted_roots on every call with no config-level
default. Any batch script that omitted the field stalled all workers at
TrustRequired with no auto-recovery path.

Changes:
- RuntimeFeatureConfig: add trusted_roots: Vec<String> field
- ConfigLoader: wire parse_optional_trusted_roots() for 'trustedRoots' key
- RuntimeConfig / RuntimeFeatureConfig: expose trusted_roots() accessor
- config_validate: add trustedRoots to TOP_LEVEL_FIELDS schema (StringArray)
- Tests: parses_trusted_roots_from_settings + trusted_roots_default_is_empty_when_unset

Callers can now set trusted_roots in .claw/settings.json:
  { "trustedRoots": ["/tmp/worktrees"] }

WorkerRegistry::spawn_worker() callers should merge config.trusted_roots()
with any per-call overrides (wiring left for follow-up).
2026-04-08 02:35:19 +09:00
YeonGyu-Kim
dd97c49e6b docs(roadmap): file startup-friction gap — no default trusted_roots in settings
WorkerCreate requires trusted_roots per-call; no config-level default.
Any batch that forgets the field stalls all workers at trust_required.
Root cause of several 'batch lanes not advancing' incidents.

Recommended fix: wire RuntimeConfig::trusted_roots() as default into
WorkerRegistry::spawn_worker(), with per-call overrides. Update
config_validate schema to include the new field.
2026-04-08 02:02:48 +09:00
YeonGyu-Kim
5dfb1d7c2b fix(config_validate): add missing aliases/providerFallbacks to schema; fix deprecated-key bypass
Two real schema gaps and one test-assertion mismatch found via dogfood (cargo test -p runtime):

1. aliases and providerFallbacks not in TOP_LEVEL_FIELDS
   - Both are valid config keys parsed by config.rs
   - Validator was rejecting them as unknown keys
   - 2 tests failing: parses_user_defined_model_aliases,
     parses_provider_fallbacks_chain

2. Deprecated keys were being flagged as unknown before the deprecated
   check ran (unknown-key check runs first in validate_object_keys)
   - Added early-exit for deprecated keys in unknown-key loop
   - Keeps deprecated→warning behavior for permissionMode/enabledPlugins
     which still appear in valid legacy configs

3. Config integration tests had assertions on format strings that never
   matched the actual validator output (path:3: vs path: ... (line N))
   - Updated assertions to check for path + line + field name as
     independent substrings instead of a format that was never produced

426 tests passing, 0 failing.
2026-04-08 01:45:08 +09:00
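
The ordering fix in 5dfb1d7 (check deprecated keys before flagging unknown ones) can be sketched as follows. The field lists are abbreviated to the keys named in the commit, and the Diagnostic type and message strings are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum Diagnostic {
    Warning(String),
    Error(String),
}

// Abbreviated lists; the real schema has more entries.
const TOP_LEVEL_FIELDS: &[&str] = &["aliases", "providerFallbacks"];
const DEPRECATED_FIELDS: &[&str] = &["permissionMode", "enabledPlugins"];

/// Deprecated keys exit early with a warning so they never reach the
/// unknown-key check, which would otherwise reject them as errors.
fn validate_object_keys(keys: &[&str]) -> Vec<Diagnostic> {
    let mut out = Vec::new();
    for key in keys {
        if DEPRECATED_FIELDS.contains(key) {
            out.push(Diagnostic::Warning(format!("deprecated key: {key}")));
            continue;
        }
        if !TOP_LEVEL_FIELDS.contains(key) {
            out.push(Diagnostic::Error(format!("unknown key: {key}")));
        }
    }
    out
}
```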
YeonGyu-Kim
fcb5d0c16a fix(worker_boot): add seconds_since_update to state snapshot
Clawhip needs to distinguish a stalled trust_required worker from one
that just transitioned. Without a pre-computed staleness field it has
to compute epoch delta itself from updated_at.

seconds_since_update = now - updated_at at snapshot write time.
Clawhip threshold: > 60s in trust_required = stalled; act.
2026-04-08 01:03:00 +09:00
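
The staleness signal from fcb5d0c, sketched with epoch seconds. The Snapshot struct is a hypothetical subset of the state file, and is_stalled shows the consumer-side threshold check rather than anything in the runtime itself.

```rust
const STALL_THRESHOLD_SECS: u64 = 60;

// Hypothetical subset of the snapshot written to the state file.
struct Snapshot {
    status: &'static str,
    updated_at: u64,
    seconds_since_update: u64,
}

/// seconds_since_update = now - updated_at, computed at snapshot write time.
fn snapshot(status: &'static str, updated_at: u64, now: u64) -> Snapshot {
    Snapshot {
        status,
        updated_at,
        // saturating_sub guards against a clock that moved backwards.
        seconds_since_update: now.saturating_sub(updated_at),
    }
}

/// Consumer-side rule: more than 60s in trust_required counts as stalled.
fn is_stalled(s: &Snapshot) -> bool {
    s.status == "trust_required" && s.seconds_since_update > STALL_THRESHOLD_SECS
}
```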
YeonGyu-Kim
314f0c99fd feat(worker_boot): emit .claw/worker-state.json on every status transition
WorkerStatus is fully tracked in worker_boot.rs but was invisible to
external observers (clawhip, orchestrators) because opencode serve's
HTTP server is upstream and not ours to extend.

Solution: atomic file-based observability.

- emit_state_file() writes .claw/worker-state.json on every push_event()
  call (tmp write + rename for atomicity)
- Snapshot includes: worker_id, status, is_ready, trust_gate_cleared,
  prompt_in_flight, last_event, updated_at
- Add 'claw state' CLI subcommand to read and print the file
- Add regression test: emit_state_file_writes_worker_status_on_transition
  verifies spawning→ready_for_prompt transition is reflected on disk

This closes the /state dogfood gap without requiring any upstream
opencode changes. Clawhip can now distinguish a truly stalled worker
(status: trust_required or running with no recent updated_at) from a
quiet-but-progressing one.
2026-04-08 00:37:44 +09:00
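
The tmp-write-plus-rename technique mentioned in 314f0c9 looks roughly like this with the standard library. The temporary file name and the function signature are assumptions; only the write-then-rename idea comes from the commit.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Sketch of an atomic state-file write: readers see either the old file or
/// the new one, never a partially written file.
fn emit_state_file(claw_dir: &Path, json: &str) -> io::Result<()> {
    fs::create_dir_all(claw_dir)?;
    // Hypothetical temp name; it must live on the same filesystem as the
    // destination for the rename to be atomic.
    let tmp = claw_dir.join("worker-state.json.tmp");
    let dest = claw_dir.join("worker-state.json");
    fs::write(&tmp, json)?;
    // rename replaces any existing destination file in one step.
    fs::rename(&tmp, &dest)
}
```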
YeonGyu-Kim
469ae0179e docs(roadmap): document WorkerState deployment architecture gap
WorkerStatus state machine exists in worker_boot.rs and is exported
from runtime/src/lib.rs. But claw-code is a plugin — it cannot add
HTTP routes to opencode serve (upstream binary, not ours).

The /state HTTP endpoint via axum was never implemented. A prior session
summary that cited commit 0984cca for it was incorrect.

Recommended path: write WorkerStatus transitions to
.claw/worker-state.json on each transition (file-based observability,
no upstream changes required). Wire WorkerRegistry::transition() to
atomic file writes + add a CLI subcommand.
2026-04-08 00:07:06 +09:00
YeonGyu-Kim
092d8b6e21 fix(tests): add missing test imports for session/prompt history features
Add missing imports to test module:
- PromptHistoryEntry, render_prompt_history_report, parse_history_count
- parse_export_args, render_session_markdown
- summarize_tool_payload_for_markdown, short_tool_id

Fixes test compilation errors introduced by new session and export
features from batch 5/6 work.
2026-04-07 16:20:33 +09:00
YeonGyu-Kim
b3ccd92d24 feat: b6-pdf-extract-v2 follow-up work — batch 6 2026-04-07 16:11:51 +09:00
YeonGyu-Kim
d71d109522 feat: b6-openai-models follow-up work — batch 6 2026-04-07 16:11:51 +09:00
YeonGyu-Kim
0f2f02af2d feat: b6-http-proxy-v2 follow-up work — batch 6 2026-04-07 16:11:51 +09:00
YeonGyu-Kim
e51566c745 feat: b6-bridge-directory follow-up work — batch 6 2026-04-07 16:11:50 +09:00
YeonGyu-Kim
20f3a5932a fix(cli): wire sessions_dir() through SessionStore::from_cwd() (#41)
The CLI was using a flat cwd/.claw/sessions/ path without workspace
fingerprinting, while SessionStore::from_cwd() adds a hash subdirectory.
This mismatch meant the isolation machinery existed but wasn't actually
used by the main session management codepath.

Now sessions_dir() delegates to SessionStore::from_cwd(), ensuring all
session operations use workspace-fingerprinted directories.
2026-04-07 16:03:44 +09:00
YeonGyu-Kim
28e6cc0965 feat(runtime): activate per-worktree session isolation (#41)
Remove #[cfg(test)] gate from session_control module — SessionStore
is now available at runtime, not just in tests. Export SessionStore and
add workspace_sessions_dir() helper that creates fingerprinted session
directories per workspace root.

This is the #41 kill shot: parallel opencode serve instances will use
separate session namespaces based on workspace fingerprint instead of
sharing a global ~/.local/share/opencode/ store.

The CLI already uses cwd/.claw/sessions/ (sessions_dir()), and now
SessionStore::from_cwd() adds workspace hash isolation on top.
2026-04-07 16:00:57 +09:00
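
A sketch of what a workspace-fingerprinted sessions directory could look like for 28e6cc0. The hash function is an assumption: DefaultHasher is used here only because it ships with std, and its algorithm is not guaranteed stable across Rust releases, so a real implementation would likely want a fixed hash. workspace_fingerprint is a hypothetical helper.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Hypothetical fingerprint: a hex-encoded hash of the workspace root path.
fn workspace_fingerprint(root: &Path) -> String {
    let mut hasher = DefaultHasher::new();
    root.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// cwd/.claw/sessions/<fingerprint>: same CWD gives the same directory,
/// different CWDs give different ones.
fn workspace_sessions_dir(cwd: &Path) -> PathBuf {
    cwd.join(".claw")
        .join("sessions")
        .join(workspace_fingerprint(cwd))
}
```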
YeonGyu-Kim
f03b8dce17 feat: bridge directory metadata + stale-base preflight check
- Add CWD to SSE session events (kills Directory: unknown)
- Add stale-base preflight: verify HEAD matches expected base commit
- Warn on divergence before session starts
2026-04-07 15:55:38 +09:00
YeonGyu-Kim
ecdca49552 feat: plugin-level max_output_tokens override via session_control 2026-04-07 15:55:38 +09:00
YeonGyu-Kim
8cddbc6615 feat: b6-sterling-deep — batch 6 2026-04-07 15:52:31 +09:00
YeonGyu-Kim
5c276c8e14 feat: b6-pdf-extract-v2 — batch 6 2026-04-07 15:52:30 +09:00
YeonGyu-Kim
1f968b359f feat: b6-openai-models — batch 6 2026-04-07 15:52:30 +09:00
YeonGyu-Kim
18d3c1918b feat: b6-http-proxy-v2 — batch 6 2026-04-07 15:52:30 +09:00
YeonGyu-Kim
8a4b613c39 feat: b6-codex-session — batch 6 2026-04-07 15:52:30 +09:00
YeonGyu-Kim
82f2e8e92b feat: doctor-cmd implementation 2026-04-07 15:28:43 +09:00
YeonGyu-Kim
8f4651a096 fix: resolve git_context field references after cherry-pick merge 2026-04-07 15:20:20 +09:00
YeonGyu-Kim
dab16c230a feat: b5-session-export — batch 5 wave 2 2026-04-07 15:19:45 +09:00
YeonGyu-Kim
a46711779c feat: b5-markdown-fence — batch 5 wave 2 2026-04-07 15:19:45 +09:00
YeonGyu-Kim
ef0b870890 feat: b5-git-aware — batch 5 wave 2 2026-04-07 15:19:45 +09:00
YeonGyu-Kim
4557a81d2f feat: b5-doctor-cmd — batch 5 wave 2 2026-04-07 15:19:45 +09:00
YeonGyu-Kim
86c3667836 feat: b5-context-compress — batch 5 wave 2 2026-04-07 15:19:45 +09:00
YeonGyu-Kim
260bac321f feat: b5-config-validate — batch 5 wave 2 2026-04-07 15:19:44 +09:00
YeonGyu-Kim
133ed4581e feat(config): add config file validation with clear error messages
Parse TOML/JSON config on startup, emit errors for unknown keys, wrong
types, deprecated fields with exact line and field name.
2026-04-07 15:10:08 +09:00
YeonGyu-Kim
8663751650 fix: resolve merge conflicts from batch 5 cherry-picks (compact field, run_turn_with_output arity) 2026-04-07 14:53:46 +09:00
YeonGyu-Kim
90f2461f75 feat: b5-tool-timeout — batch 5 upstream parity 2026-04-07 14:51:32 +09:00