
fix: limit minerva sympy memory consumption #222

Open
mitja-kleider wants to merge 1 commit into Aleph-Alpha-Research:main from mitja-kleider:minerva-memory-limit

Conversation

@mitja-kleider

No description provided.

# Virtual-address-space budget for a single sympy simplify() call.
# Pathological expressions can cause sympy to allocate tens of GiBs;
# this cap turns that into a caught MemoryError instead of an OOM kill.
_SYMPY_MEMORY_BUDGET_BYTES = 2 * 1024**3  # 2 GiB
Author

This budget is quite arbitrary and not configurable; I'll leave it to the eval team to decide whether it is sufficient.


Interesting. Why is this better than an OOM Kill? Because then we are "more sure" that the OOM was caused by the benchmark container and not something else?

Author

Yes, an OOM kill would take down the whole eval, while this just causes the function to return False. I'm not entirely sure whether this is the desired behavior or whether we instead want to somehow mark the sample as "cannot be processed in eval".


Got it. This starts a subprocess for each and every sample to check, and only the subprocess is terminated. I wonder if this is a bit expensive? This benchmark already runs pretty long. Do you have an idea of the impact on the runtime?

Author

Yes, it seems to be too expensive. @prabhuteja12 has tested it. As all intermediate checkpoint evals have been successful now, I am wondering whether this is worth it at all, or whether we should just keep it as is and restart flaky future evals.
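A middle ground for the per-sample cost raised above would be a persistent worker pool: each worker installs the rlimit once at startup, so the fork/exec price is paid per worker rather than per sample. A sketch under that assumption (`check_all` and `_check_sample` are hypothetical names, and `_check_sample` is a stand-in for the real sympy check, not the PR's code):

```python
import multiprocessing as mp
import resource

_BUDGET_BYTES = 2 * 1024**3  # reusing the PR's 2 GiB figure


def _install_budget():
    # Runs once per worker at pool startup, so the rlimit cost is paid
    # per worker rather than per sample.
    resource.setrlimit(resource.RLIMIT_AS, (_BUDGET_BYTES, _BUDGET_BYTES))


def _check_sample(sample):
    # Hypothetical stand-in for the per-sample sympy equivalence check.
    try:
        return sample * 2
    except MemoryError:
        return None  # treat a blown budget as a failed check


def check_all(samples):
    ctx = mp.get_context("fork")  # POSIX-only; keeps the sketch simple
    with ctx.Pool(processes=2, initializer=_install_budget) as pool:
        return pool.map(_check_sample, samples)
```

The trade-off: a worker that hits `MemoryError` keeps its (possibly fragmented) address space for later samples, whereas the process-per-sample approach in this PR gives every sample a clean slate.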
