SamuelGabriel
Contributor

Summary:
The MAST-calling jobs now have a retry count, so they are restarted if they fail, e.g. due to manifold errors. Additionally, I have added a 12-hour timeout, so a job that gets stuck fails after that time.

The timeout of course has a downside: a job that is legitimately meant to run that long will be failed. I'd say we should not really have any jobs like this, though, if we parallelize properly, no? Our nightlies, for example, take 5 hours at most (which is already a little unnecessarily long, given that we can parallelize things basically ad infinitum, no?).
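
For reviewers who want the intended behavior spelled out, here is a minimal sketch of the retry-plus-timeout pattern described above. It is not the MAST API and not the code in this diff: `run_with_retries`, `JobResult`, and the default `max_retries=2` are hypothetical names and values, and the job is modelled as a local subprocess. It also assumes the timeout applies per attempt and that a timed-out job is failed rather than restarted, which the summary does not specify.

```python
import subprocess
import sys
from dataclasses import dataclass

TWELVE_HOURS_S = 12 * 60 * 60  # the 12-hour timeout from the summary


@dataclass
class JobResult:
    succeeded: bool
    attempts: int
    timed_out: bool


def run_with_retries(cmd, max_retries=2, timeout_s=TWELVE_HOURS_S):
    """Run `cmd`, restarting it on failure up to `max_retries` times.

    Hypothetical helper for illustration only; not the MAST API.
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        try:
            # subprocess.run kills the child process if the timeout expires.
            completed = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Assumption: a stuck job is failed outright, not restarted.
            return JobResult(succeeded=False, attempts=attempts, timed_out=True)
        if completed.returncode == 0:
            return JobResult(succeeded=True, attempts=attempts, timed_out=False)
        # Non-zero exit (e.g. a transient storage error): loop and retry.
    return JobResult(succeeded=False, attempts=attempts, timed_out=False)


if __name__ == "__main__":
    # A job that exits cleanly succeeds on the first attempt.
    print(run_with_retries([sys.executable, "-c", "pass"], timeout_s=30))
```

With `max_retries=2`, a job that always fails is attempted three times in total before being reported as failed.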

Differential Revision: D81128135

@meta-cla bot added the CLA Signed label on Sep 26, 2025
@facebook-github-bot
Contributor

@SamuelGabriel has exported this pull request. If you are a Meta employee, you can view the originating diff in D81128135.

@codecov-commenter

Codecov Report

❌ Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 96.26%. Comparing base (044a97a) to head (8773009).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ax/benchmark/methods/modular_botorch.py | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4362   +/-   ##
=======================================
  Coverage   96.26%   96.26%           
=======================================
  Files         564      564           
  Lines       57914    57916    +2     
=======================================
+ Hits        55751    55753    +2     
  Misses       2163     2163           

Labels: CLA Signed, fb-exported, meta-exported