mirror of
https://github.com/instructkr/claw-code.git
synced 2026-04-08 00:54:49 +08:00
Block oversized requests before providers hard-fail
The runtime already tracked rough token estimates for compaction, but provider-bound requests still relied on naive model output limits and could be sent upstream even when the selected model could not fit the estimated prompt plus requested output. This adds a small model token/context registry in the API layer, estimates request size from the serialized prompt payload, and fails locally with a dedicated context-window error before Anthropic or xAI calls are made. Focused integration coverage asserts the preflight fires before any HTTP request leaves the process. Constraint: Keep the first pass minimal and reusable across both Anthropic and OpenAI-compatible providers Rejected: Auto-compact-and-retry in the same patch | broader control-flow change than the requested minimal preflight Confidence: medium Scope-risk: narrow Reversibility: clean Directive: Expand the model registry before enabling preflight for additional providers or aliases Tested: cargo build -p api -p tools -p rusty-claude-cli; cargo test -p api Not-tested: End-to-end CLI auto-compaction or retry behavior after a local context_window_blocked failure
This commit is contained in:
@@ -12,7 +12,7 @@ use crate::types::{
|
||||
ToolChoice, ToolDefinition, ToolResultContentBlock, Usage,
|
||||
};
|
||||
|
||||
use super::{Provider, ProviderFuture};
|
||||
use super::{preflight_message_request, Provider, ProviderFuture};
|
||||
|
||||
pub const DEFAULT_XAI_BASE_URL: &str = "https://api.x.ai/v1";
|
||||
pub const DEFAULT_OPENAI_BASE_URL: &str = "https://api.openai.com/v1";
|
||||
@@ -128,6 +128,7 @@ impl OpenAiCompatClient {
|
||||
stream: false,
|
||||
..request.clone()
|
||||
};
|
||||
preflight_message_request(&request)?;
|
||||
let response = self.send_with_retry(&request).await?;
|
||||
let request_id = request_id_from_headers(response.headers());
|
||||
let payload = response.json::<ChatCompletionResponse>().await?;
|
||||
@@ -142,6 +143,7 @@ impl OpenAiCompatClient {
|
||||
&self,
|
||||
request: &MessageRequest,
|
||||
) -> Result<MessageStream, ApiError> {
|
||||
preflight_message_request(request)?;
|
||||
let response = self
|
||||
.send_with_retry(&request.clone().with_streaming())
|
||||
.await?;
|
||||
|
||||
Reference in New Issue
Block a user