mirror of
https://github.com/instructkr/claw-code.git
synced 2026-04-08 00:54:49 +08:00
Block oversized requests before providers hard-fail
The runtime already tracked rough token estimates for compaction, but provider-bound requests still relied on naive model output limits and could be sent upstream even when the selected model could not fit the estimated prompt plus requested output. This adds a small model token/context registry in the API layer, estimates request size from the serialized prompt payload, and fails locally with a dedicated context-window error before Anthropic or xAI calls are made. Focused integration coverage asserts the preflight fires before any HTTP request leaves the process. Constraint: Keep the first pass minimal and reusable across both Anthropic and OpenAI-compatible providers Rejected: Auto-compact-and-retry in the same patch | broader control-flow change than the requested minimal preflight Confidence: medium Scope-risk: narrow Reversibility: clean Directive: Expand the model registry before enabling preflight for additional providers or aliases Tested: cargo build -p api -p tools -p rusty-claude-cli; cargo test -p api Not-tested: End-to-end CLI auto-compaction or retry behavior after a local context_window_blocked failure
This commit is contained in:
@@ -14,7 +14,7 @@ use telemetry::{AnalyticsEvent, AnthropicRequestProfile, ClientIdentity, Session
|
||||
use crate::error::ApiError;
|
||||
use crate::prompt_cache::{PromptCache, PromptCacheRecord, PromptCacheStats};
|
||||
|
||||
use super::{Provider, ProviderFuture};
|
||||
use super::{preflight_message_request, Provider, ProviderFuture};
|
||||
use crate::sse::SseParser;
|
||||
use crate::types::{MessageDeltaEvent, MessageRequest, MessageResponse, StreamEvent, Usage};
|
||||
|
||||
@@ -294,6 +294,8 @@ impl AnthropicClient {
|
||||
}
|
||||
}
|
||||
|
||||
preflight_message_request(&request)?;
|
||||
|
||||
let response = self.send_with_retry(&request).await?;
|
||||
let request_id = request_id_from_headers(response.headers());
|
||||
let mut response = response
|
||||
@@ -337,6 +339,7 @@ impl AnthropicClient {
|
||||
&self,
|
||||
request: &MessageRequest,
|
||||
) -> Result<MessageStream, ApiError> {
|
||||
preflight_message_request(request)?;
|
||||
let response = self
|
||||
.send_with_retry(&request.clone().with_streaming())
|
||||
.await?;
|
||||
|
||||
Reference in New Issue
Block a user