mirror of
https://github.com/instructkr/claw-code.git
synced 2026-04-13 11:34:51 +08:00
Keep completed lanes from ending on mushy stop summaries
The next repo-local sweep target was ROADMAP #69: completed lane runs could persist vague control text like “commit push everyting, keep sweeping $ralph”, which made downstream stop summaries operationally useless. The fix adds a lane-finished quality floor that preserves strong summaries, rewrites empty/control-only/too- short-without-context summaries into a contextual fallback, and records structured metadata explaining when the fallback fired. Constraint: Keep legitimate concise lane summaries intact while improving only low-signal completions Rejected: Blanket-rewrite every completed summary into a templated sentence | would erase useful model-authored detail from good lane outputs Confidence: high Scope-risk: narrow Reversibility: clean Directive: If lane-finished summary heuristics change later, update the structured `qualityFloorApplied/rawSummary/reasons/wordCount` contract and its regression tests together Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE Not-tested: External OMX consumers that may still ignore the new lane.finished data payload
This commit is contained in:
@@ -496,7 +496,7 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes
|
|||||||
|
|
||||||
62. **Worker state file surface not implemented** — **done (verified 2026-04-12):** current `main` already wires `emit_state_file(worker)` into the worker transition path in `rust/crates/runtime/src/worker_boot.rs`, atomically writes `.claw/worker-state.json`, and exposes the documented reader surface through `claw state` / `claw state --output-format json` in `rust/crates/rusty-claude-cli/src/main.rs`. Fresh proof exists in `runtime` regression `emit_state_file_writes_worker_status_on_transition`, the end-to-end `tools` regression `recovery_loop_state_file_reflects_transitions`, and direct CLI parsing coverage for `state` / `state --output-format json`. Source: Jobdori dogfood.
|
62. **Worker state file surface not implemented** — **done (verified 2026-04-12):** current `main` already wires `emit_state_file(worker)` into the worker transition path in `rust/crates/runtime/src/worker_boot.rs`, atomically writes `.claw/worker-state.json`, and exposes the documented reader surface through `claw state` / `claw state --output-format json` in `rust/crates/rusty-claude-cli/src/main.rs`. Fresh proof exists in `runtime` regression `emit_state_file_writes_worker_status_on_transition`, the end-to-end `tools` regression `recovery_loop_state_file_reflects_transitions`, and direct CLI parsing coverage for `state` / `state --output-format json`. Source: Jobdori dogfood.
|
||||||
|
|
||||||
**Scope note (verified 2026-04-12):** ROADMAP #31, #43, and #63-#68 currently appear to describe acpx/droid or upstream OMX/server orchestration behavior, not claw-code source already present in this repository. Repo-local searches for `acpx`, `use-droid`, `run-acpx`, `commit-wrapper`, `ultraclaw`, `roadmap-nudge-10min`, `OMX_TMUX_INJECT`, `/hooks/health`, and `/hooks/status` found no implementation hits outside `ROADMAP.md`, and the earlier state-surface note already records that the HTTP server is not owned by claw-code. With #45 now fixed, the remaining unresolved items in this section look like external tracking notes rather than confirmed repo-local backlog; re-check if new repo-local evidence appears.
|
**Scope note (verified 2026-04-12):** ROADMAP #31, #43, and #63-#68 currently appear to describe acpx/droid or upstream OMX/server orchestration behavior, not claw-code source already present in this repository. Repo-local searches for `acpx`, `use-droid`, `run-acpx`, `commit-wrapper`, `ultraclaw`, `roadmap-nudge-10min`, `OMX_TMUX_INJECT`, `/hooks/health`, and `/hooks/status` found no implementation hits outside `ROADMAP.md`, and the earlier state-surface note already records that the HTTP server is not owned by claw-code. With #45 and #69 now fixed, the remaining unresolved items in this section look like external tracking notes rather than confirmed repo-local backlog; re-check if new repo-local evidence appears.
|
||||||
|
|
||||||
63. **Droid session completion semantics broken: code arrives after "status: completed"** — dogfooded 2026-04-12. Ultraclaw droid sessions (use-droid via acpx) report `session.status: completed` before file writes are fully flushed/synced to the working tree. Discovered +410 lines of "late-arriving" droid output that appeared after I had already assessed 8 sessions as "no code produced." This creates false-negative assessments and duplicate work. **Fix shape:** (a) droid agent should only report completion after explicit file-write confirmation (fsync or existence check); (b) or, claw-code should expose a `pending_writes` status that indicates "agent responded, disk flush pending"; (c) lane orchestrators should poll for file changes for N seconds after completion before final assessment. **Blocker:** none. Source: Jobdori ultraclaw dogfood 2026-04-12.
|
63. **Droid session completion semantics broken: code arrives after "status: completed"** — dogfooded 2026-04-12. Ultraclaw droid sessions (use-droid via acpx) report `session.status: completed` before file writes are fully flushed/synced to the working tree. Discovered +410 lines of "late-arriving" droid output that appeared after I had already assessed 8 sessions as "no code produced." This creates false-negative assessments and duplicate work. **Fix shape:** (a) droid agent should only report completion after explicit file-write confirmation (fsync or existence check); (b) or, claw-code should expose a `pending_writes` status that indicates "agent responded, disk flush pending"; (c) lane orchestrators should poll for file changes for N seconds after completion before final assessment. **Blocker:** none. Source: Jobdori ultraclaw dogfood 2026-04-12.
|
||||||
|
|
||||||
@@ -510,6 +510,6 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes
|
|||||||
|
|
||||||
68. **Internal reinjection/resume paths leak opaque control prose** — dogfooded 2026-04-12. OMX lanes stopping with `Continue from current mode state. [OMX_TMUX_INJECT]` expose internal implementation details instead of operator-meaningful state. The event tells us *that* tmux reinjection happened, but not *why* (retry after failure? resume after idle? manual recovery?), *what state was preserved*, or *what the lane was trying to do*. **Fix shape:** recovery/reinject events should emit structured cause like: `resume_after_stop`, `retry_after_tool_failure`, `tmux_reinject_after_idle`, `manual_recovery` plus preserved state / target lane info. Never leak bare internal markers like `[OMX_TMUX_INJECT]` as the primary summary. Blocker: none. Source: gaebal-gajae dogfood analysis 2026-04-12.
|
68. **Internal reinjection/resume paths leak opaque control prose** — dogfooded 2026-04-12. OMX lanes stopping with `Continue from current mode state. [OMX_TMUX_INJECT]` expose internal implementation details instead of operator-meaningful state. The event tells us *that* tmux reinjection happened, but not *why* (retry after failure? resume after idle? manual recovery?), *what state was preserved*, or *what the lane was trying to do*. **Fix shape:** recovery/reinject events should emit structured cause like: `resume_after_stop`, `retry_after_tool_failure`, `tmux_reinject_after_idle`, `manual_recovery` plus preserved state / target lane info. Never leak bare internal markers like `[OMX_TMUX_INJECT]` as the primary summary. Blocker: none. Source: gaebal-gajae dogfood analysis 2026-04-12.
|
||||||
|
|
||||||
69. **Lane stop summaries have no minimum quality floor** — dogfooded 2026-04-12. `clawcode-human` session stopped with summary `commit push everyting, keep sweeping $ralph` — vague, typo-ridden, operationally useless. Unlike well-scoped review lanes, this summary regressed to mushy command prose with no outcome clarity. **Fix shape:** (a) enforce minimum stop/result summary standards: what was done (outcome), what was scoped (target), what's next (state); (b) typo/grammar validation; (c) reject summaries that are shorter than N words or contain only control verbs without context. Blocker: none. Source: gaebal-gajae dogfood analysis 2026-04-12.
|
69. **Lane stop summaries have no minimum quality floor** — **done (verified 2026-04-12):** completed lane persistence in `rust/crates/tools/src/lib.rs` now normalizes vague/control-only stop summaries into a contextual fallback that includes the lane target and status, while preserving structured metadata about whether the quality floor fired (`qualityFloorApplied`, `rawSummary`, `reasons`, `wordCount`). Regression coverage locks both the pass-through path for good summaries and the fallback path for mushy summaries like `commit push everyting, keep sweeping $ralph`. **Original filing below.**
|
||||||
|
|
||||||
70. **Install-source ambiguity misleads real users** — community observation 2026-04-12. User treated `claw-code.io` as official, then hit `clawcode` vs deprecated `claw-code` naming collision and concluded install story was inconsistent. Source-of-truth is not obvious when website/repo/crates naming diverges. **Fix shape:** canonical repo docs should explicitly state which site is official; installation guidance should visibly warn against deprecated `claw-code` crate and ambiguous third-party pages. Blocker: none. Source: gaebal-gajae community watch 2026-04-12.
|
70. **Install-source ambiguity misleads real users** — community observation 2026-04-12. User treated `claw-code.io` as official, then hit `clawcode` vs deprecated `claw-code` naming collision and concluded install story was inconsistent. Source-of-truth is not obvious when website/repo/crates naming diverges. **Fix shape:** canonical repo docs should explicitly state which site is official; installation guidance should visibly warn against deprecated `claw-code` crate and ambiguous third-party pages. Blocker: none. Source: gaebal-gajae community watch 2026-04-12.
|
||||||
@@ -3743,12 +3743,13 @@ fn persist_agent_terminal_state(
|
|||||||
.push(LaneEvent::failed(iso8601_now(), &blocker));
|
.push(LaneEvent::failed(iso8601_now(), &blocker));
|
||||||
} else {
|
} else {
|
||||||
next_manifest.current_blocker = None;
|
next_manifest.current_blocker = None;
|
||||||
let compressed_detail = result
|
let finished_summary = build_lane_finished_summary(&next_manifest, result);
|
||||||
.filter(|value| !value.trim().is_empty())
|
next_manifest.lane_events.push(
|
||||||
.map(|value| compress_summary_text(value.trim()));
|
LaneEvent::finished(iso8601_now(), finished_summary.detail).with_data(
|
||||||
next_manifest
|
serde_json::to_value(&finished_summary.data)
|
||||||
.lane_events
|
.expect("lane summary metadata should serialize"),
|
||||||
.push(LaneEvent::finished(iso8601_now(), compressed_detail));
|
),
|
||||||
|
);
|
||||||
if let Some(provenance) = maybe_commit_provenance(result) {
|
if let Some(provenance) = maybe_commit_provenance(result) {
|
||||||
next_manifest.lane_events.push(LaneEvent::commit_created(
|
next_manifest.lane_events.push(LaneEvent::commit_created(
|
||||||
iso8601_now(),
|
iso8601_now(),
|
||||||
@@ -3760,6 +3761,152 @@ fn persist_agent_terminal_state(
|
|||||||
write_agent_manifest(&next_manifest)
|
write_agent_manifest(&next_manifest)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
const MIN_LANE_SUMMARY_WORDS: usize = 7;
|
||||||
|
const CONTROL_ONLY_SUMMARY_WORDS: &[&str] = &[
|
||||||
|
"ack",
|
||||||
|
"commit",
|
||||||
|
"continue",
|
||||||
|
"everyting",
|
||||||
|
"everything",
|
||||||
|
"keep",
|
||||||
|
"next",
|
||||||
|
"push",
|
||||||
|
"ralph",
|
||||||
|
"resume",
|
||||||
|
"retry",
|
||||||
|
"run",
|
||||||
|
"stop",
|
||||||
|
"sweep",
|
||||||
|
"sweeping",
|
||||||
|
"team",
|
||||||
|
];
|
||||||
|
const CONTEXTUAL_SUMMARY_WORDS: &[&str] = &[
|
||||||
|
"added",
|
||||||
|
"audited",
|
||||||
|
"blocked",
|
||||||
|
"completed",
|
||||||
|
"documented",
|
||||||
|
"failed",
|
||||||
|
"finished",
|
||||||
|
"fixed",
|
||||||
|
"implemented",
|
||||||
|
"investigated",
|
||||||
|
"merged",
|
||||||
|
"pushed",
|
||||||
|
"refactored",
|
||||||
|
"removed",
|
||||||
|
"reviewed",
|
||||||
|
"tested",
|
||||||
|
"updated",
|
||||||
|
"verified",
|
||||||
|
];
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Serialize)]
|
||||||
|
struct LaneFinishedSummaryData {
|
||||||
|
#[serde(rename = "qualityFloorApplied")]
|
||||||
|
quality_floor_applied: bool,
|
||||||
|
reasons: Vec<String>,
|
||||||
|
#[serde(rename = "rawSummary", skip_serializing_if = "Option::is_none")]
|
||||||
|
raw_summary: Option<String>,
|
||||||
|
#[serde(rename = "wordCount")]
|
||||||
|
word_count: usize,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
struct LaneFinishedSummary {
|
||||||
|
detail: Option<String>,
|
||||||
|
data: LaneFinishedSummaryData,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug)]
|
||||||
|
struct LaneSummaryAssessment {
|
||||||
|
apply_quality_floor: bool,
|
||||||
|
reasons: Vec<String>,
|
||||||
|
word_count: usize,
|
||||||
|
}
|
||||||
|
|
||||||
|
fn build_lane_finished_summary(
|
||||||
|
manifest: &AgentOutput,
|
||||||
|
result: Option<&str>,
|
||||||
|
) -> LaneFinishedSummary {
|
||||||
|
let raw_summary = result.map(str::trim).filter(|value| !value.is_empty());
|
||||||
|
let assessment = assess_lane_summary_quality(raw_summary.unwrap_or_default());
|
||||||
|
let detail = match raw_summary {
|
||||||
|
Some(summary) if !assessment.apply_quality_floor => Some(compress_summary_text(summary)),
|
||||||
|
Some(summary) => Some(compose_lane_summary_fallback(manifest, Some(summary))),
|
||||||
|
None => Some(compose_lane_summary_fallback(manifest, None)),
|
||||||
|
};
|
||||||
|
|
||||||
|
LaneFinishedSummary {
|
||||||
|
detail,
|
||||||
|
data: LaneFinishedSummaryData {
|
||||||
|
quality_floor_applied: raw_summary.is_none() || assessment.apply_quality_floor,
|
||||||
|
reasons: assessment.reasons,
|
||||||
|
raw_summary: raw_summary.map(str::to_string),
|
||||||
|
word_count: assessment.word_count,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn assess_lane_summary_quality(summary: &str) -> LaneSummaryAssessment {
|
||||||
|
let words = summary
|
||||||
|
.split(|ch: char| !(ch.is_ascii_alphanumeric() || ch == '-' || ch == '#'))
|
||||||
|
.filter(|token| !token.is_empty())
|
||||||
|
.map(str::to_ascii_lowercase)
|
||||||
|
.collect::<Vec<_>>();
|
||||||
|
|
||||||
|
let word_count = words.len();
|
||||||
|
let mut reasons = Vec::new();
|
||||||
|
if summary.trim().is_empty() {
|
||||||
|
reasons.push(String::from("empty"));
|
||||||
|
}
|
||||||
|
|
||||||
|
let control_only = !words.is_empty()
|
||||||
|
&& words
|
||||||
|
.iter()
|
||||||
|
.all(|word| CONTROL_ONLY_SUMMARY_WORDS.contains(&word.as_str()));
|
||||||
|
if control_only {
|
||||||
|
reasons.push(String::from("control_only"));
|
||||||
|
}
|
||||||
|
|
||||||
|
let has_context_signal = summary.contains('`')
|
||||||
|
|| summary.contains('/')
|
||||||
|
|| summary.contains(':')
|
||||||
|
|| summary.contains('#')
|
||||||
|
|| words
|
||||||
|
.iter()
|
||||||
|
.any(|word| CONTEXTUAL_SUMMARY_WORDS.contains(&word.as_str()));
|
||||||
|
if word_count < MIN_LANE_SUMMARY_WORDS && !has_context_signal {
|
||||||
|
reasons.push(String::from("too_short_without_context"));
|
||||||
|
}
|
||||||
|
|
||||||
|
LaneSummaryAssessment {
|
||||||
|
apply_quality_floor: !reasons.is_empty(),
|
||||||
|
reasons,
|
||||||
|
word_count,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn compose_lane_summary_fallback(manifest: &AgentOutput, raw_summary: Option<&str>) -> String {
|
||||||
|
let target = manifest.description.trim();
|
||||||
|
let base = format!(
|
||||||
|
"Completed lane `{}` for target: {}. Status: completed.",
|
||||||
|
manifest.name,
|
||||||
|
if target.is_empty() {
|
||||||
|
"unspecified task"
|
||||||
|
} else {
|
||||||
|
target
|
||||||
|
}
|
||||||
|
);
|
||||||
|
match raw_summary {
|
||||||
|
Some(summary) => format!(
|
||||||
|
"{base} Original stop summary was too vague to keep as the lane result: \"{}\".",
|
||||||
|
summary.trim()
|
||||||
|
),
|
||||||
|
None => format!("{base} No usable stop summary was produced by the lane."),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
fn derive_agent_state(
|
fn derive_agent_state(
|
||||||
status: &str,
|
status: &str,
|
||||||
result: Option<&str>,
|
result: Option<&str>,
|
||||||
@@ -7240,6 +7387,14 @@ mod tests {
|
|||||||
completed_manifest_json["laneEvents"][1]["event"],
|
completed_manifest_json["laneEvents"][1]["event"],
|
||||||
"lane.finished"
|
"lane.finished"
|
||||||
);
|
);
|
||||||
|
assert_eq!(
|
||||||
|
completed_manifest_json["laneEvents"][1]["data"]["qualityFloorApplied"],
|
||||||
|
false
|
||||||
|
);
|
||||||
|
assert_eq!(
|
||||||
|
completed_manifest_json["laneEvents"][1]["detail"],
|
||||||
|
"Finished successfully in commit abc1234"
|
||||||
|
);
|
||||||
assert_eq!(
|
assert_eq!(
|
||||||
completed_manifest_json["laneEvents"][2]["event"],
|
completed_manifest_json["laneEvents"][2]["event"],
|
||||||
"lane.commit.created"
|
"lane.commit.created"
|
||||||
@@ -7301,6 +7456,51 @@ mod tests {
|
|||||||
);
|
);
|
||||||
assert_eq!(failed_manifest_json["derivedState"], "truly_idle");
|
assert_eq!(failed_manifest_json["derivedState"], "truly_idle");
|
||||||
|
|
||||||
|
let normalized = execute_agent_with_spawn(
|
||||||
|
AgentInput {
|
||||||
|
description: "Sweep the next backlog item".to_string(),
|
||||||
|
prompt: "Produce a low-signal stop summary".to_string(),
|
||||||
|
subagent_type: Some("Explore".to_string()),
|
||||||
|
name: Some("summary-floor".to_string()),
|
||||||
|
model: None,
|
||||||
|
},
|
||||||
|
|job| {
|
||||||
|
persist_agent_terminal_state(
|
||||||
|
&job.manifest,
|
||||||
|
"completed",
|
||||||
|
Some("commit push everyting, keep sweeping $ralph"),
|
||||||
|
None,
|
||||||
|
)
|
||||||
|
},
|
||||||
|
)
|
||||||
|
.expect("normalized agent should succeed");
|
||||||
|
|
||||||
|
let normalized_manifest = std::fs::read_to_string(&normalized.manifest_file)
|
||||||
|
.expect("normalized manifest should exist");
|
||||||
|
let normalized_manifest_json: serde_json::Value =
|
||||||
|
serde_json::from_str(&normalized_manifest).expect("normalized manifest json");
|
||||||
|
assert_eq!(
|
||||||
|
normalized_manifest_json["laneEvents"][1]["event"],
|
||||||
|
"lane.finished"
|
||||||
|
);
|
||||||
|
let normalized_detail = normalized_manifest_json["laneEvents"][1]["detail"]
|
||||||
|
.as_str()
|
||||||
|
.expect("normalized detail");
|
||||||
|
assert!(normalized_detail.contains("Completed lane `summary-floor`"));
|
||||||
|
assert!(normalized_detail.contains("Sweep the next backlog item"));
|
||||||
|
assert_eq!(
|
||||||
|
normalized_manifest_json["laneEvents"][1]["data"]["qualityFloorApplied"],
|
||||||
|
true
|
||||||
|
);
|
||||||
|
assert_eq!(
|
||||||
|
normalized_manifest_json["laneEvents"][1]["data"]["rawSummary"],
|
||||||
|
"commit push everyting, keep sweeping $ralph"
|
||||||
|
);
|
||||||
|
assert_eq!(
|
||||||
|
normalized_manifest_json["laneEvents"][1]["data"]["reasons"][0],
|
||||||
|
"control_only"
|
||||||
|
);
|
||||||
|
|
||||||
let spawn_error = execute_agent_with_spawn(
|
let spawn_error = execute_agent_with_spawn(
|
||||||
AgentInput {
|
AgentInput {
|
||||||
description: "Spawn error task".to_string(),
|
description: "Spawn error task".to_string(),
|
||||||
|
|||||||
Reference in New Issue
Block a user