crates/goose/src/providers/databricks.rs (34 changes: 29 additions & 5 deletions)

@@ -412,16 +412,40 @@ impl Provider for DatabricksProvider {
let status = resp.status();
let error_text = resp.text().await.unwrap_or_default();

// Parse as JSON if possible to pass to map_http_error_to_provider_error
let json_payload = serde_json::from_str::<Value>(&error_text).ok();
return Err(map_http_error_to_provider_error(status, json_payload));
}
Ok(resp)
})
.await
.inspect_err(|e| {
let _ = log.error(e);
})?;
.await;
Comment on lines 412 to +420

P2: Bypass generic retries before removing `stream_options`

On Gemini-backed Databricks endpoints this 400 is deterministic, but the request goes through `with_retry()` before the `stream_options` fallback runs. In `crates/goose/src/providers/retry.rs`, `should_retry()` treats every `ProviderError::RequestFailed` as retryable, so affected requests will send the same bad payload up to four times with backoff before line 423 finally strips `stream_options`. That adds several seconds of latency and extra billable calls to every streaming turn on the models this patch is trying to unblock.

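A minimal sketch of the suggested fix, assuming a simplified `ProviderError` stand-in rather than the actual goose type: classify a `stream_options` 400 as non-retryable so the fallback runs immediately instead of after the backoff loop.

```rust
// Simplified stand-in for goose's error type; the real `should_retry()`
// lives in crates/goose/src/providers/retry.rs.
#[derive(Debug)]
enum ProviderError {
    RequestFailed(String),
}

fn is_retryable(err: &ProviderError) -> bool {
    match err {
        // A 400 complaining about `stream_options` is deterministic:
        // retrying the identical payload cannot succeed, so bail out
        // and let the caller strip the field instead.
        ProviderError::RequestFailed(msg) if msg.contains("stream_options") => false,
        ProviderError::RequestFailed(_) => true,
    }
}

fn main() {
    let deterministic = ProviderError::RequestFailed("Invalid parameter: stream_options".into());
    let transient = ProviderError::RequestFailed("upstream timeout".into());
    assert!(!is_retryable(&deterministic));
    assert!(is_retryable(&transient));
}
```

With a check like this wired into the retry predicate, the fallback on line 423 would fire after a single failed request rather than after four.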

let response = match response {
Err(e) if e.to_string().contains("stream_options") => {
payload.as_object_mut().unwrap().remove("stream_options");
self.with_retry(|| async {
Comment on lines +423 to +425

P2: Preserve usage accounting in the `stream_options` fallback

When a Databricks endpoint rejects `stream_options`, this branch drops the whole object before retrying. On the OpenAI-compatible path, `stream_openai_compat()` only produces `ProviderUsage` from streamed usage blocks, and `crates/goose/src/agents/agent.rs:1232-1234` only updates session metrics when that usage arrives. As a result, every Gemini/Databricks turn that takes this fallback can succeed while recording no token usage or cost data, silently under-reporting spend for the exact models this patch is meant to unblock.

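The accounting gap can be sketched like this, with a simplified `Usage` stand-in for goose's actual usage type: when no chunk carries a usage block (the situation after `stream_options` is removed), accumulation yields zeros rather than an error, so the under-reporting is silent.

```rust
// Simplified stand-in for the usage accumulation done on the
// OpenAI-compatible streaming path.
#[derive(Default, Debug, PartialEq)]
struct Usage {
    input_tokens: u64,
    output_tokens: u64,
}

// Sum usage across streamed chunks; chunks without a usage block
// contribute nothing.
fn accumulate(chunks: &[Option<Usage>]) -> Usage {
    let mut total = Usage::default();
    for u in chunks.iter().flatten() {
        total.input_tokens += u.input_tokens;
        total.output_tokens += u.output_tokens;
    }
    total
}

fn main() {
    // With stream_options stripped, no chunk carries usage, so the
    // session records zero spend even though the turn succeeded.
    let no_usage: Vec<Option<Usage>> = vec![None, None, None];
    assert_eq!(accumulate(&no_usage), Usage::default());
}
```

One mitigation (hypothetical, not in this patch) would be to flag turns that took the fallback and estimate usage from the accumulated text instead of recording nothing.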
let resp = self
.api_client
.response_post(Some(session_id), &path, &payload)
.await?;
if !resp.status().is_success() {
let status = resp.status();
let error_text = resp.text().await.unwrap_or_default();
let json_payload = serde_json::from_str::<Value>(&error_text).ok();
return Err(map_http_error_to_provider_error(status, json_payload));
}
Ok(resp)
})
.await
.inspect_err(|e| {
let _ = log.error(e);
})?
}
Err(e) => {
let _ = log.error(&e);
return Err(e);
}
Ok(resp) => resp,
};

stream_openai_compat(response, log)
}

crates/goose/src/providers/formats/openai.rs (74 changes: 66 additions & 8 deletions)

@@ -48,9 +48,25 @@ struct DeltaToolCall {
extra: Option<serde_json::Map<String, Value>>,
}

#[derive(Serialize, Deserialize, Debug)]
#[serde(untagged)]
enum DeltaContent {
String(String),
Array(Vec<ContentPart>),
}

#[derive(Serialize, Deserialize, Debug)]
struct ContentPart {
r#type: String,
text: String,
#[serde(rename = "thoughtSignature")]
thought_signature: Option<String>,
}

#[derive(Serialize, Deserialize, Debug)]
struct Delta {
content: Option<String>,
#[serde(default)]
content: Option<DeltaContent>,
role: Option<String>,
tool_calls: Option<Vec<DeltaToolCall>>,
reasoning_details: Option<Vec<Value>>,
@@ -74,6 +90,32 @@ struct StreamingChunk {
model: Option<String>,
}

fn extract_content_and_signature(
delta_content: Option<&DeltaContent>,
) -> (Option<String>, Option<String>) {
match delta_content {
Some(DeltaContent::String(s)) => (Some(s.clone()), None),
Some(DeltaContent::Array(parts)) => {
let text_parts: Vec<_> = parts.iter().filter(|p| p.r#type == "text").collect();

let text = text_parts
.iter()
.map(|p| p.text.as_str())
.collect::<String>();
Comment on lines +99 to +104

P1: Keep thought parts out of visible assistant text

Gemini's OpenAI-compatible chat completions support `include_thoughts`, and those thought summaries arrive as content parts rather than `reasoning_content`. This code concatenates every `type == "text"` part into `MessageContent::text`, but `ContentPart` never records whether a part was a thought, so any `thought: true` segment will be shown and persisted as normal assistant output instead of hidden `MessageContent::Thinking`. In thought-enabled sessions that leaks the model's internal reasoning to users.

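One possible shape for the fix, sketched with simplified stand-in types: deserialize the part's `thought` flag (following Gemini's content-part shape) and route flagged parts to thinking content instead of visible text.

```rust
// Simplified stand-in for the ContentPart in this diff, extended with
// the `thought` flag Gemini sets on include_thoughts summaries.
struct ContentPart {
    r#type: String,
    text: String,
    thought: Option<bool>,
}

// Split parts into (visible text, thinking text) so thought summaries
// can become MessageContent::thinking rather than MessageContent::text.
fn split_parts(parts: &[ContentPart]) -> (String, String) {
    let mut visible = String::new();
    let mut thinking = String::new();
    for p in parts.iter().filter(|p| p.r#type == "text") {
        if p.thought.unwrap_or(false) {
            thinking.push_str(&p.text);
        } else {
            visible.push_str(&p.text);
        }
    }
    (visible, thinking)
}

fn main() {
    let parts = vec![
        ContentPart { r#type: "text".into(), text: "plan the answer...".into(), thought: Some(true) },
        ContentPart { r#type: "text".into(), text: "Here is the answer.".into(), thought: None },
    ];
    let (visible, thinking) = split_parts(&parts);
    assert_eq!(visible, "Here is the answer.");
    assert_eq!(thinking, "plan the answer...");
}
```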

let signature = text_parts
.iter()
.find_map(|p| p.thought_signature.as_ref())
Comment on lines +106 to +108

P1: Use the last `thoughtSignature` in a streamed content array

Gemini 3 places the continuation signature on the last part of a non-function response, but `extract_content_and_signature()` keeps the first `thoughtSignature` it sees. If a single `delta.content` array contains multiple text parts, `last_signature` can be set to a stale value, and the next tool call/follow-up turn will be replayed with the wrong signature. That breaks the continuation contract this patch is trying to restore and can reproduce Gemini's missing/invalid `thought_signature` 400s on later turns.

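A one-line sketch of the suggested selection order, using a simplified `Part` stand-in: iterate in reverse so the last signed part wins.

```rust
// Simplified stand-in for a streamed content part.
struct Part {
    thought_signature: Option<String>,
}

// Take the signature from the *last* signed part, per Gemini 3's
// convention that the continuation signature rides on the final part.
fn last_signature(parts: &[Part]) -> Option<&str> {
    parts.iter().rev().find_map(|p| p.thought_signature.as_deref())
}

fn main() {
    let parts = vec![
        Part { thought_signature: Some("stale".into()) },
        Part { thought_signature: None },
        Part { thought_signature: Some("current".into()) },
    ];
    assert_eq!(last_signature(&parts), Some("current"));
}
```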
.cloned();

let text = if text.is_empty() { None } else { Some(text) };

(text, signature)
}
None => (None, None),
}
}

pub fn format_messages(messages: &[Message], image_format: &ImageFormat) -> Vec<Value> {
let mut messages_spec = Vec::new();
for message in messages {
@@ -564,6 +606,7 @@ where

let mut accumulated_reasoning: Vec<Value> = Vec::new();
let mut accumulated_reasoning_content = String::new();
let mut last_signature: Option<String> = None;

'outer: while let Some(response) = stream.next().await {
let response_str = response?;
Expand Down Expand Up @@ -685,14 +728,23 @@ where
serde_json::from_str::<Value>(arguments)
};

let metadata = extra_fields.as_ref().filter(|m| !m.is_empty());
let metadata = if let Some(sig) = &last_signature {
let mut combined = extra_fields.clone().unwrap_or_default();
combined.insert(
crate::providers::formats::google::THOUGHT_SIGNATURE_KEY.to_string(),
json!(sig)
);
Some(combined)
} else {
extra_fields.as_ref().filter(|m| !m.is_empty()).cloned()
};

let content = match parsed {
Ok(params) => {
MessageContent::tool_request_with_metadata(
id.clone(),
Ok(CallToolRequestParams::new(function_name.clone()).with_arguments(object(params))),
metadata,
metadata.as_ref(),
)
},
Err(e) => {
@@ -704,7 +756,7 @@ where
)),
data: None,
};
MessageContent::tool_request_with_metadata(id.clone(), Err(error), metadata)
MessageContent::tool_request_with_metadata(id.clone(), Err(error), metadata.as_ref())
}
};
contents.push(content);
@@ -731,13 +783,20 @@ where

if let Some(reasoning) = &chunk.choices[0].delta.reasoning_content {
if !reasoning.is_empty() {
content.push(MessageContent::thinking(reasoning, ""));
let signature = last_signature.as_deref().unwrap_or("");
content.push(MessageContent::thinking(reasoning, signature));
}
}

if let Some(text) = &chunk.choices[0].delta.content {
let (text_content, thought_signature) = extract_content_and_signature(chunk.choices[0].delta.content.as_ref());

if let Some(sig) = thought_signature {
last_signature = Some(sig);
}

if let Some(text) = text_content {
if !text.is_empty() {
content.push(MessageContent::text(text));
content.push(MessageContent::text(&text));
Comment on lines +791 to +799

P2: Reuse a stable id when array-content chunks omit `id`

Databricks/Gemini chunks on this new `DeltaContent::Array` path can have `"id": null` (the example in this commit does), and `ui/desktop/src/hooks/useChatStream.ts:178-212` only coalesces streamed deltas when consecutive messages share an `id`. Because these array-content chunks now flow through as separate assistant messages with no fallback `id`, the desktop stream renders one timestamped bubble per delta instead of extending a single reply, which matches the regression called out in the commit message.

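A sketch of the fallback, with `StreamChunk` and `session_turn_id` as hypothetical stand-ins: reuse one id per assistant turn whenever the chunk omits its own, so consecutive deltas share an id and the desktop UI coalesces them.

```rust
// Simplified stand-in for a streamed chunk that may carry "id": null.
struct StreamChunk {
    id: Option<String>,
}

// Prefer the chunk's own id; otherwise fall back to a stable id that
// is fixed for the duration of the assistant turn, so useChatStream.ts
// can merge consecutive deltas into one bubble.
fn effective_id(chunk: &StreamChunk, session_turn_id: &str) -> String {
    chunk.id.clone().unwrap_or_else(|| session_turn_id.to_string())
}

fn main() {
    let a = StreamChunk { id: None };
    let b = StreamChunk { id: None };
    let turn = "turn-1";
    // Both id-less chunks resolve to the same id and therefore coalesce.
    assert_eq!(effective_id(&a, turn), effective_id(&b, turn));
}
```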
}
Comment on lines +797 to 800

P1: Preserve thought signatures on plain-text assistant turns

This branch updates `last_signature` and then downgrades the signed Gemini part to `MessageContent::Text`, which means the signature is discarded unless a tool call happens later in the same response. In `format_messages()` assistant text is serialized back as bare text, so a successful first reply from Gemini 3 cannot be appended "including thought signatures" on the next request. For text-only chats that turns this fix into a one-turn success followed by lost reasoning context or another `thought_signature` validation error on the next turn.

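One way this could be addressed, sketched with a hypothetical `SignedText` stand-in (goose's `MessageContent` has no such variant in this diff): move the signature onto the text content itself so `format_messages()` can replay it on the next request instead of dropping it.

```rust
// Hypothetical content shape that carries the signature alongside the
// visible text, rather than discarding it on text-only turns.
struct SignedText {
    text: String,
    thought_signature: Option<String>,
}

// take() moves the pending signature onto the content instead of
// leaving it behind to be silently dropped when no tool call follows.
fn to_content(text: String, last_signature: &mut Option<String>) -> SignedText {
    SignedText {
        text,
        thought_signature: last_signature.take(),
    }
}

fn main() {
    let mut last = Some("sig-123".to_string());
    let content = to_content("Hello".into(), &mut last);
    assert_eq!(content.thought_signature.as_deref(), Some("sig-123"));
    assert!(last.is_none()); // consumed, not dropped
}
```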
}

@@ -748,7 +807,6 @@ where
content,
);

// Add ID if present
if let Some(id) = chunk.id {
msg = msg.with_id(id);
}