claw-code/PARITY.md at c2f1304a01718c46e23e10888a105f76a9a136cd

Yeachan-Heo c2f1304a01 Lock down CLI-to-mock behavioral parity for Anthropic flows

This adds a deterministic mock Anthropic-compatible /v1/messages service,
a clean-environment CLI harness, and repo docs so the first parity
milestone can be validated without live network dependencies.

Constraint: First milestone must prove Rust claw can connect from a clean environment and cover streaming, tool assembly, and permission/tool flow
Constraint: No new third-party dependencies; reuse the existing Rust workspace stack
Rejected: Record/replay live Anthropic traffic | nondeterministic and unsuitable for repeatable CI coverage
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep scenario markers and expected tool payload shapes synchronized between the mock service and the harness tests
Tested: cargo fmt --all
Tested: cargo clippy --workspace --all-targets -- -D warnings
Tested: cargo test --workspace
Tested: ./scripts/run_mock_parity_harness.sh
Not-tested: Live Anthropic responses beyond the five scripted harness scenarios

Tool	Rust Impl	Behavioral Notes
bash	`runtime::bash` 283 LOC	subprocess exec, timeout, background, sandbox — strong parity. Missing: sedValidation, pathValidation, readOnlyValidation, destructiveCommandWarning, commandSemantics (upstream has 18 submodules for bash alone)
read_file	`runtime::file_ops`	offset/limit read — good parity
write_file	`runtime::file_ops`	file create/overwrite — good parity
edit_file	`runtime::file_ops`	old/new string replacement — good parity. Missing: replace_all was recently added
glob_search	`runtime::file_ops`	glob pattern matching — good parity
grep_search	`runtime::file_ops`	ripgrep-style search — good parity
WebFetch	`tools`	URL fetch + content extraction — moderate parity (need to verify content truncation, redirect handling vs upstream)
WebSearch	`tools`	search query execution — moderate parity
TodoWrite	`tools`	todo/note persistence — moderate parity
Skill	`tools`	skill discovery/install — moderate parity
Agent	`tools`	agent delegation — moderate parity
ToolSearch	`tools`	tool discovery — good parity
NotebookEdit	`tools`	jupyter notebook cell editing — moderate parity
Sleep	`tools`	delay execution — good parity
SendUserMessage/Brief	`tools`	user-facing message — good parity
Config	`tools`	config inspection — moderate parity
EnterPlanMode	`tools`	worktree plan mode toggle — good parity
ExitPlanMode	`tools`	worktree plan mode restore — good parity
StructuredOutput	`tools`	passthrough JSON — good parity
REPL	`tools`	subprocess code execution — moderate parity
PowerShell	`tools`	Windows PowerShell execution — moderate parity

Tool	Status	Notes
AskUserQuestion	stub	needs user I/O integration
TaskCreate	stub	needs sub-agent runtime
TaskGet	stub	needs task registry
TaskList	stub	needs task registry
TaskStop	stub	needs process management
TaskUpdate	stub	needs task message passing
TaskOutput	stub	needs output capture
TeamCreate	stub	needs parallel task orchestration
TeamDelete	stub	needs team lifecycle
CronCreate	stub	needs scheduler runtime
CronDelete	stub	needs cron registry
CronList	stub	needs cron registry
LSP	stub	needs language server client
ListMcpResources	stub	needs MCP client
ReadMcpResource	stub	needs MCP client
McpAuth	stub	needs OAuth flow
MCP	stub	needs MCP tool proxy
RemoteTrigger	stub	needs HTTP client
TestingPermission	stub	test-only, low priority

5.4 KiB

Raw Blame History

Parity Status — claw-code Rust Port

Mock parity harness — milestone 1

Tool Surface: 40/40 (spec parity)

Real Implementations (behavioral parity — varying depth)

Stubs Only (surface parity, no behavior)

Slash Commands: 67/141 upstream entries

Missing Behavioral Features (in existing real tools)

Runtime Behavioral Gaps

Migration Readiness

5.4 KiB Raw Blame History

Parity Status — claw-code Rust Port

Mock parity harness — milestone 1

Tool Surface: 40/40 (spec parity)

Real Implementations (behavioral parity — varying depth)

Stubs Only (surface parity, no behavior)

Slash Commands: 67/141 upstream entries

Missing Behavioral Features (in existing real tools)

Runtime Behavioral Gaps

Migration Readiness

5.4 KiB

Raw Blame History