fix: repair pad_id fallback order in finetuning generators #86
Open
peter941221 wants to merge 2 commits into
Bug
Both finetuning token generators compute eos_token before assigning self.pad_id.
In finetuning_token_generator.py, the fallback for eos_id is None reads self.pad_id before the assignment self.pad_id = pad_id has run. The sibling implementation in finetuning_token_generator_mllama.py has the same ordering.
As a result, both constructors raise AttributeError whenever eos_id is None.
Fix
Assign self.pad_id before computing self.eos_token in both generators.
This is an order-only change. It keeps the existing fallback behavior and makes the eos_id is None path valid.
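A minimal sketch of the ordering issue, for reviewers who want to see it in isolation. The class names, the constructor signature, and the StubTokenizer below are hypothetical stand-ins, not the real modelzoo classes; only the relative order of the two assignments is meant to mirror the change.

```python
class StubTokenizer:
    """Hypothetical stand-in for the real tokenizer (assumption)."""

    def convert_ids_to_tokens(self, token_id):
        return f"tok-{token_id}"


class BuggyGenerator:
    """Illustrative constructor with the pre-fix ordering."""

    def __init__(self, tokenizer, eos_id, pad_id):
        self.tokenizer = tokenizer
        self.eos_id = eos_id
        # When eos_id is None, this reads self.pad_id before it is assigned,
        # so attribute lookup fails with AttributeError.
        self.eos_token = tokenizer.convert_ids_to_tokens(
            self.pad_id if eos_id is None else eos_id
        )
        self.pad_id = pad_id


class FixedGenerator:
    """Illustrative constructor with the post-fix ordering."""

    def __init__(self, tokenizer, eos_id, pad_id):
        self.tokenizer = tokenizer
        self.eos_id = eos_id
        # Assign pad_id first so the fallback below can read it.
        self.pad_id = pad_id
        self.eos_token = tokenizer.convert_ids_to_tokens(
            self.pad_id if eos_id is None else eos_id
        )
```

With eos_id=None, constructing BuggyGenerator raises AttributeError, while FixedGenerator falls back to the pad token. When eos_id is set, both orderings behave the same, which is why the change is order-only.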
Verification
I reproduced the failure in both generators with an isolated constructor stub that passes eos_id=None.
Before the change:
AttributeError: 'FinetuningTokenGenerator' object has no attribute 'pad_id'
After the change:
pad_id=99, eos_id=None, eos_token=tok-99
I also added a focused regression test at src/cerebras/modelzoo/data_preparation/data_preprocessing/test_finetuning_token_generator_init.py.
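For illustration only, a regression test of this kind could look roughly like the sketch below. It is not the contents of the test file in this PR: GeneratorStub and StubTokenizer are hypothetical, self-contained stand-ins that only reproduce the constructor ordering, so the sketch runs without the modelzoo package installed.

```python
import unittest


class StubTokenizer:
    """Hypothetical tokenizer stub (assumption, not the real tokenizer)."""

    def convert_ids_to_tokens(self, token_id):
        return f"tok-{token_id}"


class GeneratorStub:
    """Hypothetical stand-in that uses the fixed assignment order."""

    def __init__(self, tokenizer, eos_id, pad_id):
        self.eos_id = eos_id
        self.pad_id = pad_id  # assigned before the fallback reads it
        self.eos_token = tokenizer.convert_ids_to_tokens(
            self.pad_id if eos_id is None else eos_id
        )


class InitOrderTest(unittest.TestCase):
    def test_eos_id_none_falls_back_to_pad_id(self):
        gen = GeneratorStub(StubTokenizer(), eos_id=None, pad_id=99)
        self.assertEqual(gen.pad_id, 99)
        self.assertIsNone(gen.eos_id)
        self.assertEqual(gen.eos_token, "tok-99")

    def test_explicit_eos_id_is_used(self):
        gen = GeneratorStub(StubTokenizer(), eos_id=2, pad_id=99)
        self.assertEqual(gen.eos_token, "tok-2")
```

The first case covers the path that previously failed; the second guards against the reorder changing behavior when eos_id is provided.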
Verification commands:
PYTHONPATH=src python3 -m unittest src/cerebras/modelzoo/data_preparation/data_preprocessing/test_finetuning_token_generator_init.py
python3 -m py_compile src/cerebras/modelzoo/data_preparation/data_preprocessing/finetuning_token_generator.py src/cerebras/modelzoo/data_preparation/data_preprocessing/finetuning_token_generator_mllama.py
Fixes #64.
This applies the same constructor-order repair to the multimodal sibling generator.