support loading pipeline from transformer style (flat) repo #14096
yiyixuxu wants to merge 3 commits into
is_safetensors_compatible: filter to weight files up front so config-only folders (e.g. scheduler/) no longer form spurious components. A pipeline whose weights live at the repo root (rather than in component subfolders) is now correctly detected as safetensors-compatible, instead of falling back to .bin and dropping the root .safetensors.

_allow_files: optional list of exact filenames in model_index.json that are added to the download set, for repos that keep a component's config/tokenizer files at the root where the folder-based allow patterns would miss them.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
# Files explicitly allow-listed by the repo author via `_allow_files` in `model_index.json` are
# added to the download set. This supports repos that keep a component's config/tokenizer files at
# the root (instead of in its own subfolder), where the folder-based allow patterns would miss them.
allow_patterns += allow_filenames
This needs to work with a field like this: https://huggingface.co/YiYiXu/diffusiongemma-test/blob/main/model_index.json#L4
An alternative is to add these processor-related files to the default allow_patterns here (without the model author having to specify them in model_index.json). I leaned toward the design in this PR mainly because:
(1) it's more practical and more remote-friendly: for a repo hosted this way with slightly different files, the model author can just update model_index.json instead of opening a PR against diffusers.
(2) more fundamentally, I don't want to broadly support loading components from the root. It is just too ambiguous for diffusers, because we have multiple components and we often co-host our checkpoints in the same repo as the research repo's original checkpoint, which is hosted at the root. We already have some implicit root-loading support today, but I don't think we need to expand it as default behavior. _allow_files keeps it explicit and opt-in.
But I'm open to different opinions!
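To make the opt-in behavior concrete, here is a minimal sketch of how an `_allow_files` list could extend folder-based allow patterns. It is not the diffusers implementation: `build_allow_patterns`, `select_files`, and the example file list are hypothetical names made up for illustration, and the sketch deliberately ignores the weight-file patterns that the real download logic also handles.

```python
from fnmatch import fnmatchcase


def build_allow_patterns(config_dict, folder_names):
    """Hypothetical helper: folder-based patterns plus explicit `_allow_files` entries."""
    # One glob per component subfolder, plus the pipeline index itself.
    allow_patterns = [f"{name}/*" for name in folder_names]
    allow_patterns.append("model_index.json")
    # Exact filenames listed by the repo author are appended as-is (opt-in).
    allow_filenames = config_dict.get("_allow_files", [])
    allow_patterns += allow_filenames
    return allow_patterns


def select_files(repo_files, allow_patterns):
    """Keep only the repo files that match at least one allow pattern."""
    return [f for f in repo_files if any(fnmatchcase(f, p) for p in allow_patterns)]


# Assumed flat-repo layout, for illustration only.
repo_files = [
    "model_index.json",
    "config.json",
    "tokenizer.json",
    "scheduler/scheduler_config.json",
    "original/big_checkpoint.ckpt",
]

without_allow = select_files(repo_files, build_allow_patterns({}, ["scheduler"]))
with_allow = select_files(
    repo_files,
    build_allow_patterns({"_allow_files": ["tokenizer.json"]}, ["scheduler"]),
)
```

In this sketch, `tokenizer.json` at the root is picked up only when the author lists it, while unrelated root or co-hosted files such as `original/big_checkpoint.ckpt` stay out of the download set either way.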
Cover root-level safetensors weights and the flat layout where weights live at the root alongside a weight-less subfolder (e.g. scheduler/), which must not prevent the root safetensors from being recognized.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Mirrors test_smart_download against a unet-pipeline-dummy clone that lists big_array.npy in _allow_files, asserting it is downloaded (whereas test_smart_download asserts it is skipped without _allow_files).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
]
self.assertTrue(is_safetensors_compatible(filenames, variant="fp16"))

def test_diffusers_is_compatible_weightless_subfolder(self):
Only this test will have a different result from main.
Currently, on main, this case is marked as not is_safetensors_compatible.
passed_components = passed_components or []
# only weight files matter for safetensors compatibility
filenames = filter_model_files(filenames)
I think this is a bug, but let me know. cc @DN6
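The effect of filtering to weight files first can be illustrated with a simplified stand-in for the compatibility check. This is a sketch under stated assumptions, not the code in diffusers: `sketch_is_safetensors_compatible` and `WEIGHT_EXTENSIONS` are hypothetical, variants are ignored, and only the grouping-by-subfolder idea is modeled.

```python
from collections import defaultdict

# Assumption for this sketch: only these suffixes count as weight files.
WEIGHT_EXTENSIONS = (".safetensors", ".bin")


def sketch_is_safetensors_compatible(filenames, filter_weights_first):
    """Simplified stand-in: every component subfolder needs a .safetensors file."""
    if filter_weights_first:
        # Drop configs, tokenizer files, etc. before grouping into components.
        filenames = [f for f in filenames if f.endswith(WEIGHT_EXTENSIONS)]

    # Group "<component>/<file>" entries by their component folder.
    components = defaultdict(list)
    for f in filenames:
        parts = f.split("/")
        if len(parts) == 2:
            components[parts[0]].append(parts[1])

    # No component folders: fall back to checking the repo root.
    if not components:
        return any(f.endswith(".safetensors") for f in filenames)

    return all(
        any(name.endswith(".safetensors") for name in names)
        for names in components.values()
    )


# Flat layout: weights at the root, plus a config-only scheduler/ folder.
flat_repo = ["model.safetensors", "config.json", "scheduler/scheduler_config.json"]

unfiltered_result = sketch_is_safetensors_compatible(flat_repo, filter_weights_first=False)
filtered_result = sketch_is_safetensors_compatible(flat_repo, filter_weights_first=True)
```

Without the filter, `scheduler/` is treated as a component that has no safetensors file, so the whole repo is reported as incompatible. With the filter, no subfolder component remains and the root `model.safetensors` is what gets checked.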
With this PR, we want to support loading a pipeline from google/diffusiongemma-26B-A4B-it, and also future model releases in the same style:
model_index.json file into the same repo

Currently, we have some legacy support for loading components from the root, but it is very limited (e.g., it does not work if the repo also has a subfolder, or if your component comes with files other than the regular weights). Since all our releases use the subfolder structure, this was sufficient (I'm actually not sure why the legacy support was there at all; I assume a very early model was released that way, and we kept it for backward compatibility).

This PR extends that a bit further to support transformer + diffusers release (we will likely see more)
(model and processor live in the root, scheduler has its own folder)
_allow_files field to model_index.json to allow the user to define an explicit list of filenames that should always be downloaded. This mirrors the _ignore_files field we already support. For DiffusionGemma, we need to list the root-level processor/tokenizer config files, etc.

You can test with https://huggingface.co/YiYiXu/diffusiongemma-test; the only difference is https://huggingface.co/YiYiXu/diffusiongemma-test/blob/main/model_index.json#L4