Skip to content

Conversation

@aibrahim-oai
Copy link
Collaborator

@aibrahim-oai aibrahim-oai commented Oct 24, 2025

move the truncation logic to conversation history to use on any tool output. This will help us in avoiding edge cases while truncating the tool calls and mcp calls.

Copy link
Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@aibrahim-oai
Copy link
Collaborator Author

@codex review this

Copy link
Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

pub(crate) fn get_formatted_input(&self) -> Vec<ResponseItem> {
let mut input = self.input.clone();
// when using the *Freeform* apply_patch tool specifically, tool outputs
// should be structured text, not json. Do NOT reserialize when using
// the Function tool - note that this differs from the check above for
// instructions. We declare the result as a named variable for clarity.
let is_freeform_apply_patch_tool_present = self.tools.iter().any(|tool| match tool {
ToolSpec::Freeform(f) => f.name == "apply_patch",
_ => false,
});
if is_freeform_apply_patch_tool_present {
reserialize_shell_outputs(&mut input);
}

P1 Badge Shell tool outputs still serialized as JSON unless freeform apply_patch is enabled

The new truncation flow expects function_call_output content sent back to the model to be plain text with Exit code/Wall time headers, and the added integration tests assert that these outputs are not valid JSON. However, Prompt::get_formatted_input only converts the JSON payload produced by format_exec_output_for_model into a structured plain‑text string when a freeform apply_patch tool is present (client_common.rs lines 71‑84). For normal sessions that use only the standard shell tool, the reserialization step never runs, so the next prompt still contains the original JSON blob. This leaves the function_call_output valid JSON (and without the expected line prefixes), causing the new tests to fail and the model to see a different format than intended. The reserialization should happen whenever shell outputs are present, not only when the freeform apply_patch tool is configured.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@aibrahim-oai aibrahim-oai changed the title truncate Centralize truncation in conversation history Oct 24, 2025
let output = mock2
.single_request()
.function_call_output_text(call_id)
.expect("function_call_output present for rmcp call");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and anyhow::Result

// Verifies that an MCP tool call result exceeding the model formatting limits
// is truncated before being sent back to the model.
#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
async fn mcp_tool_call_output_exceeds_limit_truncated_for_model() -> Result<()> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have all truncation tests in one place or colocated with tool-specific tests ( next to other mcp tests)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's distributed now like the shell tool ones. I feel we should at least have some of them in one place to test the truncation functionality. wdut?

5
6
.*
\[\.{3} omitted \d+ of 400 lines \.{3}\]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we fully hardcode?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

.expect("function error output present");

// Expect plaintext with byte-truncation marker and no omitted-lines marker
assert!(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use regex match approach ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait should we hardcode or use regex?

}
}

impl ConversationHistory {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why second impl?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be removed

}
}

fn truncate_context_output(content: &str) -> String {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove

format_exec_output(content)
}

pub(crate) fn format_exec_output(content: &str) -> String {
Copy link
Collaborator

@pakrym-oai pakrym-oai Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rename, it's not exec specific

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right. I just copied the code and forgot that lol. thanks

ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: truncated,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't content sometimes JSON encoded?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it can be for MCPs for something. It won't affect our main tools tho. The model doesn't care because it's text but I guess it can lose understanding of the json schema. I think MCP servers should handle their output to not overwhelm the model. This is our line of defense against violators. thoughts?

let truncated_by_bytes = content.len() > MODEL_FORMAT_MAX_BYTES;
let marker = if omitted > 0 {
Some(format!(
"\n[... omitted {omitted} of {total_lines} lines ...]\n\n"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a bit wrong. You are counting metadata lines and not just shell output lines.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true. it would be +3 or sth.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants