[swe] Add truncation fix for miles agentic tool call#785
maocheng23 wants to merge 2 commits into main from
Conversation
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request enhances the agentic tool call system by introducing a configurable mechanism to limit the total token length of agent responses. It integrates this limit across the server and client components, ensuring that responses exceeding the specified token count are either truncated or result in an 'OutputLimitExceeded' status, thereby preventing excessively long outputs and improving resource management.
Code Review
This pull request adds a mechanism to limit the total number of tokens in an agentic tool call by truncating the response. The changes span the server, the agent function, and the rollout generation logic, introducing a new `max_total_response_tokens` parameter. The implementation correctly propagates this parameter and applies truncation when the limit is exceeded. I've found one potential robustness issue in the new truncation logic that could lead to a `TypeError` if certain sample properties are `None`.
```python
sample.rollout_log_probs = sample.rollout_log_probs[:keep_tokens]
sample.loss_mask = sample.loss_mask[:keep_tokens]
```
The `rollout_log_probs` and `loss_mask` attributes of a `Sample` can be `None` according to their type hints in `miles/utils/types.py`. Slicing a `None` value will raise a `TypeError`. To make this utility function more robust, you should check that these attributes are not `None` before attempting to slice them.
Suggested change:

```diff
-sample.rollout_log_probs = sample.rollout_log_probs[:keep_tokens]
-sample.loss_mask = sample.loss_mask[:keep_tokens]
+if sample.rollout_log_probs is not None:
+    sample.rollout_log_probs = sample.rollout_log_probs[:keep_tokens]
+if sample.loss_mask is not None:
+    sample.loss_mask = sample.loss_mask[:keep_tokens]
```
```python
if exc_type in _TIMEOUT_EXCEPTIONS:
    return "LimitsExceeded"
if exc_type in _OUTPUT_LIMIT_EXCEPTIONS:
    return "OutputLimitExceeded"
```
It's a bit odd to have both `LimitsExceeded` and `OutputLimitExceeded`. I'd propose renaming `LimitsExceeded` to `TimeLimitExceeded` and `OutputLimitExceeded` to `SequenceLengthLimitExceeded`.

More generally, I don't think passing literal strings around is best practice. I'm OK with it as-is, though, since everything is under experimental.
```python
metadata = input.sample.metadata
if max_total_response_tokens is not None:
    metadata = {**metadata, "max_total_response_tokens": max_total_response_tokens}
```
Super nit: I think the Python style guide would recommend `metadata |= {"max_total_response_tokens": max_total_response_tokens}`, since this modifies the dictionary `metadata` in place instead of creating a new dictionary instance and assigning it to `metadata`.
Here I would prefer the copy, as it's safer.
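The trade-off behind this exchange can be demonstrated directly: `|=` mutates the dict object shared with the sample, while `{**metadata, ...}` leaves it untouched. A minimal sketch, with a plain dict standing in for `input.sample.metadata`:

```python
# In-place update (the |= suggestion): also mutates the dict the sample holds.
sample_metadata = {"task": "tool_call"}
metadata = sample_metadata
metadata |= {"max_total_response_tokens": 400}
assert "max_total_response_tokens" in sample_metadata  # side effect on the sample

# Copy-then-update (what the diff does): the sample's own dict is unchanged.
sample_metadata = {"task": "tool_call"}
metadata = {**sample_metadata, "max_total_response_tokens": 400}
assert "max_total_response_tokens" not in sample_metadata
```

This is why the copy is the safer choice here: the per-call limit never leaks back into the sample's stored metadata.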
```python
allowed_output = sample.response_length - overshoot
if allowed_output <= 0:
    break
```
Here, if I understand correctly, if the prompt part of a sample already exceeds the length limit, the entire sample is dropped. I think this is quite dangerous: if `max_total_response_tokens` is so small that the prompt of the first sample crosses the limit, the returned list will be empty, which crashes `merge_samples` in `sample_utils.py`.

I'm not sure whether we can do the following: if the prompt in a sample has 500 tokens and the response in the same sample has 200 tokens, with a token limit of 400, then we clip the sample, keeping just the first 400 tokens of the prompt with no response.

Also, I'd recommend renaming `max_total_response_tokens` to `max_total_sample_tokens` or something similar, since we are counting both prompt and response tokens.
There are two parts:
- Here we actually collect all samples from a single trajectory and merge later, so we will still have some data (unless the initial prompt is super long).
- I'll update the name to `max_seq_len`.
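The clipping behavior debated above can be sketched as a standalone budget loop. This is a hypothetical illustration, not the PR's code: samples are modeled as `(prompt_len, response_len)` pairs, and the function name and signature are invented for the sketch:

```python
def clip_samples(samples: list[tuple[int, int]], max_seq_len: int) -> list[tuple[int, int]]:
    """Keep samples until the running token count reaches max_seq_len.

    Unlike a plain `break` on overshoot, the last sample is clipped rather
    than dropped, so the result is non-empty even when the very first
    prompt alone crosses the limit (the failure mode raised in review).
    """
    kept: list[tuple[int, int]] = []
    used = 0
    for prompt_len, response_len in samples:
        total = prompt_len + response_len
        if used + total <= max_seq_len:
            kept.append((prompt_len, response_len))
            used += total
            continue
        budget = max_seq_len - used
        if budget <= 0:
            break
        if prompt_len >= budget:
            # Prompt alone exhausts the budget: keep a clipped prompt, no response.
            kept.append((budget, 0))
        else:
            # Keep the full prompt and clip the response to the remaining budget.
            kept.append((prompt_len, budget - prompt_len))
        break
    return kept
```

For the reviewer's example (a 500-token prompt, 200-token response, limit 400), this keeps a single `(400, 0)` sample instead of returning an empty list.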
As title: add a length limit for agentic tool calls.