tests: fix bootstrap mock + force-OOM in crash-recovery test by ykhrustalev · Pull Request #6 · Liquid4All/lqh

ykhrustalev · 2026-05-15T23:57:17Z

The two bootstrap tests asserted on uv venv / python3 -m venv commands that bootstrap_remote never issued: the catch-all mock returned rc=0 for test -x .lqh-env/bin/python, so bootstrap hit the "venv already exists, reusing" idempotency branch and skipped venv creation entirely. Have the mock signal "venv missing" for that probe so the creation path runs and the assertion is exercised.
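The masking effect described above can be reproduced with a small stand-alone sketch. Everything here is hypothetical and only models the described behaviour: `FakeRunner` stands in for the test's catch-all mock, `bootstrap_sketch` is a simplified model of the idempotency branch (not the real `bootstrap_remote`), and the exact `uv venv .lqh-env` command line is an assumption.

```python
from dataclasses import dataclass, field

# Probe string taken from the PR description.
VENV_PROBE = "test -x .lqh-env/bin/python"


@dataclass
class FakeRunner:
    """Hypothetical stand-in for the remote command mock; records every command."""
    venv_exists: bool = False
    calls: list = field(default_factory=list)

    def run(self, cmd: str) -> int:
        self.calls.append(cmd)
        if cmd == VENV_PROBE:
            # The fix: report "venv missing" (non-zero) unless told otherwise.
            return 0 if self.venv_exists else 1
        return 0  # catch-all: every other command "succeeds"


def bootstrap_sketch(runner: FakeRunner) -> None:
    """Simplified model of the idempotency branch, not the real bootstrap_remote."""
    if runner.run(VENV_PROBE) == 0:
        return  # venv already exists, reusing
    runner.run("uv venv .lqh-env")  # assumed command line, for illustration only


missing = FakeRunner(venv_exists=False)
bootstrap_sketch(missing)

present = FakeRunner(venv_exists=True)
bootstrap_sketch(present)

print(missing.calls)
print(present.calls)
```

With `venv_exists=True` (the old catch-all behaviour) the creation command is never recorded, so an assertion on it cannot pass; with the probe returning non-zero, the creation path runs.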

The OOM crash test seeded the dataset with trivially short content ("Convert file_X.mp4 to mp3"), so the SFT collator padded to the longest sample in the batch (~30 tokens), not max_seq_length=8192. The advertised bs=512 × seq=8192 config collapsed to bs=20 × seq=30 and ran to completion on any modern GPU. Seed the 20 samples with long content (~60K chars) so tokenization overshoots max_seq_length and truncation fills the seq dim — verified to raise torch.OutOfMemoryError on an A10 (24GB), exercising the crash-detection plumbing the test was meant to cover.
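A toy illustration of why short samples shrink the batch: with pad-to-longest collation, the sequence dimension is the length of the longest tokenized sample, capped by truncation at `max_seq_length`. The whitespace tokenizer below is a stand-in chosen for this sketch, so its token counts differ from a real tokenizer's (4 tokens here versus the ~30 quoted above); `toy_tokenize` and `padded_batch_shape` are hypothetical names.

```python
MAX_SEQ_LENGTH = 8192


def toy_tokenize(text: str, max_len: int = MAX_SEQ_LENGTH) -> list:
    # Stand-in tokenizer: one token per whitespace-separated word,
    # truncated to max_len like a real tokenizer with truncation enabled.
    return text.split()[:max_len]


def padded_batch_shape(texts: list) -> tuple:
    # Pad-to-longest collation: the sequence dimension equals the longest
    # sample in the batch, not max_seq_length.
    token_lists = [toy_tokenize(t) for t in texts]
    longest = max(len(tokens) for tokens in token_lists)
    return (len(token_lists), longest)


# 20 short samples, as in the original test seed.
short_samples = [f"Convert file_{i}.mp4 to mp3" for i in range(20)]

# 20 long samples of 60,000 characters each (12,000 toy tokens).
long_samples = ["word " * 12000 for _ in range(20)]

short_shape = padded_batch_shape(short_samples)
long_shape = padded_batch_shape(long_samples)
print(short_shape, long_shape)
```

Only when every sample overshoots `max_seq_length` does truncation pin the sequence dimension at 8192, which is the condition the long-content seed is meant to create.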

The two bootstrap tests asserted on ``uv venv`` / ``python3 -m venv`` commands that bootstrap_remote never issued: the catch-all mock returned rc=0 for ``test -x .lqh-env/bin/python``, so bootstrap hit the "venv already exists, reusing" idempotency branch and skipped venv creation entirely. Have the mock signal "venv missing" for that probe so the creation path runs and the assertion is exercised.

The crash-recovery test seeded the dataset with trivially short content ("Convert file_X.mp4 to mp3"), so the SFT collator padded to the longest sample in the batch (~30 tokens), not max_seq_length=8192. The advertised ``bs=512 × seq=8192`` config collapsed to ``bs=20 × seq=30`` and ran to completion on any modern GPU.

Replace the OOM mechanism with a deterministic crash trigger: point ``base_model`` at a nonexistent HuggingFace identifier so the subprocess fails fast at model-load time. GPU-size agnostic, faster (~9s vs ~15s), and tests the same plumbing: subprocess exits non-zero, ``SubprocessManager.get_status`` transitions to ``failed``, stderr captures the traceback.

Renamed to ``test_subprocess_crash_detected`` to reflect the actual coverage; OOM-specific verification (if needed) belongs in a separate small test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
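The plumbing that the deterministic trigger exercises can be sketched without a GPU or any model download. This is an illustration under stated assumptions, not the project's code: `run_and_classify` is a hypothetical helper, the child process simply raises an error to imitate a failed model load, and the real `SubprocessManager.get_status` interface is not reproduced here.

```python
import subprocess
import sys

# Child program that fails immediately, imitating a model load against a
# nonexistent identifier. The identifier and message are made up.
CHILD_CODE = 'raise OSError("no-such-org/no-such-model is not a valid model identifier")'


def run_and_classify(code: str) -> tuple:
    """Hypothetical helper: run a child Python process and map its exit code to a status."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    status = "completed" if proc.returncode == 0 else "failed"
    return status, proc.stderr


status, stderr = run_and_classify(CHILD_CODE)
print(status)
print(stderr)
```

The three observable facts match the ones listed above: a non-zero exit, a `failed` status, and a traceback on stderr. None of them depends on how much GPU memory the host has.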

ykhrustalev marked this pull request as draft May 15, 2026 23:57

ykhrustalev force-pushed the claude/fix-bootstrap-and-oom-tests branch from e53f917 to d37d396 Compare May 16, 2026 00:09

ykhrustalev wants to merge 1 commit into main from claude/fix-bootstrap-and-oom-tests
