@finbarrtimbers
Collaborator

@finbarrtimbers finbarrtimbers commented Nov 24, 2025

Changes needed to support this:

  • Instance-level session cache instead of a class-level session cache
  • Uses backoff for retries instead of requests' native retry support

I am currently working on moving the calls to reward_fn in #1225, which requires calling the reward function asynchronously from LLMRayActor. If we use asyncio.to_thread for that, it starves the thread pool.
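
To illustrate the thread-pool concern: asyncio.to_thread runs each call on the event loop's default executor, so the number of calls in flight is capped by the pool size, while native coroutines have no such cap. The following is a minimal stdlib-only sketch, not code from this PR. `PeakCounter`, `blocking_call`, and `native_call` are hypothetical stand-ins for the reward function's HTTP call, and the pool size is passed in only to make the cap visible.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class PeakCounter:
    """Tracks how many calls are in flight at once (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def enter(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)

    def exit(self):
        with self._lock:
            self.current -= 1


def blocking_call(counter):
    # Stand-in for a blocking HTTP request (e.g. made with requests).
    counter.enter()
    time.sleep(0.05)
    counter.exit()


async def native_call(counter):
    # Stand-in for a natively async HTTP request (e.g. made with aiohttp).
    counter.enter()
    await asyncio.sleep(0.05)
    counter.exit()


async def peak_with_to_thread(n_calls, pool_size):
    # asyncio.to_thread submits work to the loop's default executor.
    loop = asyncio.get_running_loop()
    loop.set_default_executor(ThreadPoolExecutor(max_workers=pool_size))
    counter = PeakCounter()
    await asyncio.gather(
        *(asyncio.to_thread(blocking_call, counter) for _ in range(n_calls))
    )
    return counter.peak


async def peak_native(n_calls):
    counter = PeakCounter()
    await asyncio.gather(*(native_call(counter) for _ in range(n_calls)))
    return counter.peak
```

With six calls and a two-worker pool, the to_thread variant never has more than two calls in flight, whereas the native variant starts all six before any of them finishes.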

Runs:


Note

Replaces requests with aiohttp in CodeVerifier to enable native async calls with backoff retries and event-loop–scoped session caching, adds session cleanup, introduces aiohttp dependency, and documents a Beaker logs command.

  • Verifier (open_instruct/ground_truth_utils.py):
    • Switch HTTP client from requests to aiohttp; implement native-async _verify_code and async_call.
    • Add exponential backoff (backoff) for post retries; compute dynamic timeouts.
    • Introduce event-loop–scoped session cache via weakref.WeakKeyDictionary; add cleanup_all_sessions.
    • Update sync __call__ to use a fresh aiohttp.ClientSession via asyncio.run.
    • Minor type hints and code extraction (extract_python_code unchanged).
  • Dependencies:
    • Add aiohttp>=3.9.0 to pyproject.toml (lockfile updated accordingly).
  • Docs:
    • Add Beaker logs tip in AGENTS.md (beaker experiment logs $EXPERIMENT_ID).

Written by Cursor Bugbot for commit 25391d7. This will update automatically on new commits.

@finbarrtimbers finbarrtimbers changed the title from "Switches CodeVerifier to use aiohttp instead of requests, enabling natively async requests, and avoiding using asyncio.to_thread." to "Switches CodeVerifier to use aiohttp instead of requests, enabling native async, and avoiding using asyncio.to_thread." Nov 25, 2025
@finbarrtimbers finbarrtimbers marked this pull request as ready for review November 25, 2025 17:25
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

max_tries=3,
max_time=60,
giveup=lambda e: isinstance(e, aiohttp.ClientResponseError) and e.status < 500,
)

Bug: Timeout retries waste execution server resources

The backoff decorator retries on asyncio.TimeoutError, which includes timeouts from the session.post() call. When code execution legitimately exceeds the timeout duration, retrying won't help and wastes execution server resources by running the same slow code multiple times. The previous requests implementation only retried specific 5xx status codes and didn't retry on timeouts.
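
The policy this comment argues for (retry transient server errors, but let timeouts propagate after a single attempt) can be sketched without any third-party library. This is a hand-rolled stand-in for the backoff decorator, not the PR's code: `TransientServerError`, `call_with_retries`, and `base_delay` are hypothetical names introduced for illustration.

```python
import asyncio


class TransientServerError(Exception):
    """Hypothetical stand-in for a retryable 5xx response."""


async def call_with_retries(make_request, max_tries=3, base_delay=0.01):
    """Await make_request(), retrying only on TransientServerError.

    asyncio.TimeoutError is deliberately not caught, so a request that
    times out is attempted once and the error reaches the caller.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return await make_request()
        except TransientServerError:
            if attempt == max_tries:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

With the backoff library itself, a comparable effect would presumably come from leaving asyncio.TimeoutError out of the exception tuple or rejecting it in the giveup predicate.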

backoff.expo,
(aiohttp.ClientError, asyncio.TimeoutError),
max_tries=3,
max_time=60,

Bug: Backoff max_time too short for long requests

The backoff max_time=60 is too short for requests that can take up to 300 seconds to time out. When a request times out after 300 seconds, the elapsed time already exceeds max_time, so no retry is attempted despite max_tries=3. This makes the retry mechanism ineffective for slow code executions that legitimately time out.
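
The arithmetic behind this comment can be checked with a small model. It assumes the retry budget is compared against elapsed time after each failed attempt, as the comment describes, and that every attempt runs until the request timeout. Jitter is ignored, and `attempts_before_giveup` and `base_wait` are hypothetical names, not part of the backoff library.

```python
def attempts_before_giveup(request_timeout, max_time, max_tries, base_wait=1.0):
    """Count attempts made when every attempt ends in a timeout.

    Simplified model: retrying stops once the attempt count reaches
    max_tries or the elapsed time reaches max_time.
    """
    elapsed = 0.0
    attempts = 0
    while True:
        attempts += 1
        elapsed += request_timeout  # the attempt runs until it times out
        if attempts >= max_tries or elapsed >= max_time:
            return attempts
        # Exponential wait before the next attempt, capped by the budget.
        elapsed += min(base_wait * 2 ** (attempts - 1), max_time - elapsed)
```

Under this model a 300-second timeout with max_time=60 yields a single attempt, while a 10-second timeout with the same budget still gets all three.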

@finbarrtimbers
Collaborator Author

Turns out this isn't actually the bottleneck, so I'm closing this for now.
