Conversation

@flxst commented Dec 16, 2024

What does this PR do?

This PR is an attempt to fix #284. With these changes, the checkpoint conversion runs through successfully. However, it should be double-checked that everything is correct; further adjustments may be necessary.
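
For reviewers who want to sanity-check the result, a minimal sketch of the HF round-trip a converted checkpoint has to survive. `HFModelAdapterConfig` and `HFModelAdapter` are the adapter classes this PR touches; the import path, the `model_type` string, and the checkpoint path are illustrative assumptions, not taken from the repo:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Import path is an assumption; adjust to wherever the adapter lives in modalities.
from modalities.models.huggingface.huggingface_model import HFModelAdapter, HFModelAdapterConfig

# Register the custom adapter so the Auto* classes can resolve it
# (the "modalities" model_type string is illustrative).
AutoConfig.register("modalities", HFModelAdapterConfig)
AutoModelForCausalLM.register(HFModelAdapterConfig, HFModelAdapter)

# Loading the converted checkpoint like any other HF model is the
# end-to-end check that the conversion produced something usable.
model = AutoModelForCausalLM.from_pretrained("path/to/converted_hf_checkpoint")
```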

General Changes

  • ..

Breaking Changes

  • ..

Checklist before submitting final PR

  • My PR is minimal and addresses one issue in isolation
  • I have merged the latest version of the target branch into this feature branch
  • I have reviewed my own code w.r.t. correct implementation, missing type hints, proper documentation, etc.
  • I have run a sample config for model training
  • I have checked that all tests run through (python tests/tests.py)
  • I have updated the internal changelog (CHANGELOG_DEV.md)

@flxst requested review from ajude2s and rrutmann December 16, 2024 16:57
```diff
  model_input = {"input_ids": input_ids, "attention_mask": attention_mask}
  model_forward_output: dict[str, torch.Tensor] = self.model.forward(model_input)
- if return_dict:
+ if not return_dict:
```
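
For context, a minimal sketch of what the surrounding `forward()` plausibly looks like; that the wrapped Modalities model returns a plain dict of tensors and that a `prediction_key` attribute names the logits entry are both assumptions, not verified against the repo:

```python
import torch
from transformers.modeling_outputs import CausalLMOutput

def forward(self, input_ids, attention_mask=None, return_dict=True, **kwargs):
    # The wrapped Modalities model takes a single input dict and returns a dict of tensors.
    model_input = {"input_ids": input_ids, "attention_mask": attention_mask}
    model_forward_output: dict[str, torch.Tensor] = self.model.forward(model_input)
    if return_dict:
        # HF-style callers (generate(), pipelines, the eval harness) reach this
        # branch by default and expect a ModelOutput with a .logits attribute.
        return CausalLMOutput(logits=model_forward_output[self.prediction_key])
    # Otherwise hand back the raw logits tensor.
    return model_forward_output[self.prediction_key]
```

Whichever branch the diff ends up flipping, the invariant to preserve is that the `return_dict=True` path yields a `ModelOutput`, since that is what transformers' defaults hand to downstream consumers.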
Collaborator

I executed the checkpoint conversion. Everything works perfectly.

I also ran the eval harness with these changes, since HFModelAdapterConfig and HFModelAdapter are required when evaluating Modalities with the eval harness as well. This change appears to break the eval run; after reverting it, the run goes through again.
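
A plausible explanation for the breakage (an assumption, not verified against the harness code): the eval harness drives the adapter through the standard HF calling convention, where `return_dict` effectively defaults to True and the returned object must expose `.logits`:

```python
import torch

# `model` stands for a loaded HFModelAdapter instance (hypothetical setup).
input_ids = torch.tensor([[1, 2, 3]])
attention_mask = torch.ones_like(input_ids)

out = model(input_ids=input_ids, attention_mask=attention_mask)  # return_dict not passed
logits = out.logits  # AttributeError if the return_dict branch hands back a bare dict/tensor
```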

@le1nux added this to the 100B milestone Oct 8, 2025


Development

Successfully merging this pull request may close these issues.

Checkpoint conversion to HF fails
