roadmap: #246 filed

Yeachan-Heo
2026-04-26 00:31:28 +00:00
committed by YeonGyu-Kim
parent 8e9ba9234a
commit 2d4806c163

@@ -16634,3 +16634,11 @@ Filed 2026-04-26 09:10 KST from Sigrid Jin's day-2 claw-code field report in `#c
Observed/claimed shape from field report plus repo grep: local search tooling is centered around the existing client-side `WebSearch` path rather than a pluggable `SearchProvider` abstraction. There is no first-class `.config/searchProvider.json` parser registry, no `.claw/settings.json` search-provider section, no typed provider enum/config for `ddg | tavily | brave | firecrawl`, no per-provider parser contract, no provider capability flags (HTML scrape vs JSON API vs crawl/extract), no parser-version/provenance field on search results, and no health/fallback telemetry that says `DDG parser stale, falling back to Brave/Tavily/Firecrawl`. This is distinct from #233: #233 covers SERVER-MANAGED provider-native web-search-with-citations (`web_search_20250305`, OpenAI Responses web_search, encrypted/citation tool results). #245 covers the CLIENT-SIDE local WebSearch tool shadow that claw-code already has, and makes it provider-pluggable/config-driven instead of DuckDuckGo-parser-coupled.
Required fix shape: (a) introduce a `SearchProvider` config model in `.claw/settings.json` with selected provider, credentials/env binding, timeout/rate-limit, and fallback order; (b) support at least `ddg`, `tavily`, `brave`, and `firecrawl` as provider backends; (c) move provider parser/extractor rules into a loadable registry such as `.config/searchProvider.json` or a versioned bundled registry with user overrides; (d) define a normalized `SearchResult` contract with title/url/snippet/source/provider/parser_version/raw_score plus optional extracted content/citation metadata; (e) emit typed telemetry/events for provider selected, parser version used, parser failure, fallback taken, and zero-result-vs-parser-break distinction; (f) add tests with frozen fixtures per provider so search quality regressions do not silently look like “no results.” Acceptance: users can switch DDG→Tavily/Brave/Firecrawl without code changes, parser updates can ship as config/registry changes, and a DOM/API drift produces an observable parser/fallback event instead of silent degraded search. **Status:** Open. No source code changed. Filed as ROADMAP-only pinpoint from Sigrid Jin's direct field report. Cluster delta: client-side-websearch-tool-shadow +1, provider-configurability +1, parser-externalization cluster founded, search-quality/fallback-observability cluster founded; linked to #233 as the client-side configurable complement to server-managed web-search-with-citations.
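The fallback and telemetry behavior in (d) and (e) can be sketched as follows. This is a minimal illustration, not claw-code's implementation: `SearchEvent`, `ParserBreak`, and `search_with_fallback` are hypothetical names, the `SearchResult` fields are a subset of the contract listed above, and the decision to fall back only on a parser break (not on a genuine empty result) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


class ParserBreak(Exception):
    """Hypothetical error: the provider response no longer matches the parser."""


@dataclass(frozen=True)
class SearchResult:
    # Subset of the normalized contract named in the fix shape.
    title: str
    url: str
    snippet: str
    provider: str
    parser_version: str
    raw_score: Optional[float] = None


@dataclass(frozen=True)
class SearchEvent:
    # kind: "provider_selected" | "parser_failure" | "fallback_taken" | "zero_results"
    kind: str
    provider: str
    detail: str = ""


Backend = Callable[[str], List[SearchResult]]


def search_with_fallback(
    query: str, order: List[str], backends: Dict[str, Backend]
) -> Tuple[List[SearchResult], List[SearchEvent]]:
    """Try providers in configured order, recording why each step happened."""
    events: List[SearchEvent] = []
    for index, name in enumerate(order):
        events.append(SearchEvent("provider_selected", name))
        try:
            results = backends[name](query)
        except ParserBreak as err:
            # A parser break is observable and triggers fallback.
            events.append(SearchEvent("parser_failure", name, str(err)))
            if index + 1 < len(order):
                events.append(
                    SearchEvent("fallback_taken", order[index + 1], f"from {name}")
                )
            continue
        if not results:
            # A genuine empty result is reported as such, not as a failure.
            events.append(SearchEvent("zero_results", name))
        return results, events
    return [], events
```

With this shape, a stale DDG parser yields a `parser_failure` event followed by `fallback_taken`, while a query that truly matches nothing yields `zero_results`, so the two cases stay distinguishable in telemetry.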
## Pinpoint #246 — Provider credentials and base URLs are env/dotenv-first instead of a typed settings.json provider-auth registry
Dogfooded 2026-04-26 09:30 KST on `feat/jobdori-168c-emission-routing`, following Sigrid Jin's field report that the next local fork change is to remove environment-variable dependency and move configuration fully into `settings.json`. Repo grep confirms provider construction is still centered on `from_env` / `read_env_non_empty` paths: Anthropic reads `ANTHROPIC_API_KEY`, `ANTHROPIC_AUTH_TOKEN`, `ANTHROPIC_BASE_URL`; OpenAI-compatible providers read `OPENAI_API_KEY`, `OPENAI_BASE_URL`, `XAI_API_KEY`, `XAI_BASE_URL`, `DASHSCOPE_API_KEY`, `DASHSCOPE_BASE_URL`; dotenv fallback exists, but there is no typed `.claw/settings.json` provider-auth registry that can declare provider credentials, base URLs, auth source precedence, per-provider capabilities, or secret reference indirection in one auditable config surface.
This is distinct from #245: #245 externalizes client-side WebSearch provider/parser selection (`ddg | tavily | brave | firecrawl`). #246 covers the broader provider-auth/config substrate that #245 would need for Brave/Tavily/Firecrawl keys and that every model provider already depends on. Today a user has to reason about process env, shell persistence, dotenv discovery, saved OAuth behavior, and provider-specific env var names. That creates startup friction, invisible config provenance, and poor portability across terminals, cron jobs, tmux sessions, GUI launches, and forked local workflows. It also makes bug reports harder: `doctor` can say an env var is missing, but it cannot show a redacted settings-derived provider registry, nor explain why env beat settings or settings beat dotenv, because no typed precedence model exists.
Required fix shape: (a) add a typed provider configuration section in `.claw/settings.json` such as `providers.<name>.apiKey`, `authToken`, `baseUrl`, `source`, `enabled`, `capabilities`, and `secretRef`; (b) define deterministic precedence across CLI flag, settings, env, dotenv, and saved OAuth, with provenance surfaced in `status`/`doctor` JSON and text output; (c) support redacted display and validation without leaking secret values; (d) allow search-provider credentials from #245 to use the same registry rather than introducing a separate ad-hoc key path; (e) emit config-load telemetry for selected auth source, missing/empty secret, invalid base URL, and fallback taken; (f) add migration guidance/tests proving env-only setups still work while settings-first setups require no shell exports. Acceptance: a fresh user can configure Anthropic/OpenAI/xAI/DashScope/Brave/Tavily/Firecrawl entirely through settings.json (or secret refs) and `claw doctor --json` can explain exactly which provider config was used, where it came from, and why, without depending on terminal-specific environment state. **Status:** Open. No source code changed. Filed as ROADMAP-only dogfood pinpoint from the 2026-04-26 00:30 UTC nudge. Cluster delta: startup-friction +1, config-provenance +1, settings-first-provider-auth cluster founded, env/dotenv-precedence-observability cluster founded; linked to #245 because pluggable search providers require the same settings-backed credential substrate.
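The precedence, provenance, and redaction requirements in (b) and (c) can be sketched as below. This is an illustration under stated assumptions, not the project's design: the names `PRECEDENCE`, `ResolvedSecret`, `resolve_secret`, and `redact` are hypothetical, and the order CLI flag > settings > env > dotenv > saved OAuth is assumed from the listing order above, since the roadmap only requires that the order be deterministic.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple

# Assumed precedence, highest first; the roadmap does not fix the order.
PRECEDENCE: Tuple[str, ...] = ("cli", "settings", "env", "dotenv", "oauth")


@dataclass(frozen=True)
class ResolvedSecret:
    value: str
    source: str                # which layer supplied the value
    skipped: Tuple[str, ...]   # higher-precedence layers that were missing or empty


def resolve_secret(candidates: Mapping[str, Optional[str]]) -> Optional[ResolvedSecret]:
    """Pick the first non-empty value in precedence order and record provenance."""
    skipped = []
    for source in PRECEDENCE:
        value = (candidates.get(source) or "").strip()
        if value:
            return ResolvedSecret(value, source, tuple(skipped))
        skipped.append(source)
    return None


def redact(secret: str, keep: int = 4) -> str:
    """Mask a secret for display, keeping a short suffix only when it is long enough."""
    if len(secret) <= keep * 2:
        return "*" * len(secret)
    return "*" * (len(secret) - keep) + secret[-keep:]


def doctor_entry(provider: str, candidates: Mapping[str, Optional[str]]) -> dict:
    """Build a redacted, provenance-carrying record suitable for JSON output."""
    resolved = resolve_secret(candidates)
    if resolved is None:
        return {"provider": provider, "status": "missing", "checked": list(PRECEDENCE)}
    return {
        "provider": provider,
        "status": "ok",
        "source": resolved.source,
        "apiKey": redact(resolved.value),
        "skipped": list(resolved.skipped),
    }
```

A record like this lets a diagnostic command state both which layer won and which higher-precedence layers were empty, without printing the secret itself.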