-
Notifications
You must be signed in to change notification settings - Fork 471
Switches CodeVerifier to use aiohttp instead of requests, enabling native async, and avoiding using asyncio.to_thread.
#1227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CodeVerifier to use aiohttp instead of requests, enabling natively async requests, and avoiding using asyncio.to_thread.CodeVerifier to use aiohttp instead of requests, enabling native async, and avoiding using asyncio.to_thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| max_tries=3, | ||
| max_time=60, | ||
| giveup=lambda e: isinstance(e, aiohttp.ClientResponseError) and e.status < 500, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Timeout retries waste execution server resources
The backoff decorator retries on asyncio.TimeoutError, which includes timeouts from the session.post() call. When code execution legitimately exceeds the timeout duration, retrying won't help and wastes execution server resources by running the same slow code multiple times. The previous requests implementation only retried specific 5xx status codes and didn't retry on timeouts.
| backoff.expo, | ||
| (aiohttp.ClientError, asyncio.TimeoutError), | ||
| max_tries=3, | ||
| max_time=60, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Backoff max_time too short for long requests
The backoff max_time=60 is too short for requests that can timeout after up to 300 seconds. When a request times out after 300 seconds, the elapsed time exceeds max_time, preventing any retries despite max_tries=3. This makes the retry mechanism ineffective for slow code executions that legitimately timeout.
|
Turns out this isn't actually the bottleneck, so I'm closing this for now. |
Changes needed to support this:
backofffor retries, instead ofrequestsnative supportI am currently working on moving the calls to
reward_fnin #1225, which requires calling the reward function asynchronously from LLMRayActor, and if we useasyncio.to_threadit starves the thread pool.Runs:
Note
Replaces
requestswithaiohttpinCodeVerifierto enable native async calls with backoff retries and event-loop–scoped session caching, adds session cleanup, introducesaiohttpdependency, and documents a Beaker logs command.open_instruct/ground_truth_utils.py):requeststoaiohttp; implement native-async_verify_codeandasync_call.backoff) forpostretries; compute dynamic timeouts.weakref.WeakKeyDictionary; addcleanup_all_sessions.__call__to use a freshaiohttp.ClientSessionviaasyncio.run.extract_python_codeunchanged).aiohttp>=3.9.0topyproject.toml(lockfile updated accordingly).AGENTS.md(beaker experiment logs $EXPERIMENT_ID).Written by Cursor Bugbot for commit 25391d7. This will update automatically on new commits. Configure here.