
support loading pipeline from transformer style (flat) repo #14096

Open
yiyixuxu wants to merge 3 commits into main from support-root-layout-weights


@yiyixuxu yiyixuxu commented Jun 30, 2026

Collaborator

This PR adds support for loading a pipeline from google/diffusiongemma-26B-A4B-it, and from future model releases that follow the same style:

  • The model is released in transformers with the flat structure, i.e., the model and processor are hosted directly in the root folder.
  • We add the pipeline and scheduler in diffusers, and can then add a scheduler folder and a model_index.json file to the same repo.

Currently, we have some legacy support for loading components from the root, but it is very limited (e.g., it does not work if the repo also has a subfolder, or if the component comes with files other than the regular weights). Since all our releases use the subfolder structure, this was sufficient as is. (I'm actually not sure why the legacy support was there at all; I assume a very early model was released that way, and we kept it for backward compatibility.)

This PR extends that support a bit further to cover combined transformers + diffusers releases (we will likely see more of these):

  • Make it work with a mixed structure (some components at the root, some in subfolders). Here, the model and processor live in the root, and the scheduler has its own folder.
  • Add an _allow_files field to model_index.json that lets the repo author define an explicit list of filenames that should always be downloaded. This mirrors the _ignore_files field we already support. For DiffusionGemma, we need to list the root-level processor/tokenizer config files, etc.
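
As a rough illustration of the second point, a model_index.json for such a mixed-layout repo might look like the dict below. Only the `_allow_files` key and the pipeline class name come from this PR; the version string, the scheduler class, and the listed file names are hypothetical placeholders, not the contents of the actual repo.

```python
import json

# Hypothetical sketch of a model_index.json for a mixed-layout repo.
# Everything except "_allow_files" and the pipeline class name is a placeholder.
model_index = {
    "_class_name": "DiffusionGemmaPipeline",
    "_diffusers_version": "0.0.0",  # placeholder version string
    # Exact root-level filenames that should always be downloaded (assumed names).
    "_allow_files": [
        "tokenizer.json",
        "tokenizer_config.json",
        "processor_config.json",
    ],
    # The scheduler keeps its own subfolder; the class name here is made up.
    "scheduler": ["diffusers", "HypotheticalScheduler"],
}

print(json.dumps(model_index, indent=2))
```

The point of the sketch is that the list holds exact filenames rather than glob patterns, so the repo author opts in file by file.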

You can test with https://huggingface.co/YiYiXu/diffusiongemma-test; the only difference is in https://huggingface.co/YiYiXu/diffusiongemma-test/blob/main/model_index.json#L4

import torch
from diffusers import DiffusionGemmaPipeline

model_id = "YiYiXu/diffusiongemma-test"
pipe = DiffusionGemmaPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

output = pipe(
    prompt="Why is the sky blue?",
    gen_length=256,
    generator=torch.Generator("cuda").manual_seed(42),
)
print(output.texts[0])

is_safetensors_compatible: filter to weight files up front so config-only
folders (e.g. scheduler/) no longer form spurious components. A pipeline whose
weights live at the repo root (rather than in component subfolders) is now
correctly detected as safetensors-compatible, instead of falling back to .bin
and dropping the root .safetensors.

_allow_files: optional list of exact filenames in model_index.json that are
added to the download set, for repos that keep a component's config/tokenizer
files at the root where the folder-based allow patterns would miss them.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
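
The effect of filtering to weight files first can be illustrated with a simplified, standalone check. This is a sketch under stated assumptions, not the diffusers implementation: `looks_safetensors_compatible`, `WEIGHT_EXTENSIONS`, and the group-by-folder logic are hypothetical stand-ins for `is_safetensors_compatible` and `filter_model_files`, and variants and passed components are ignored.

```python
from collections import defaultdict

# Assumed set of weight-file extensions, for illustration only.
WEIGHT_EXTENSIONS = (".safetensors", ".bin")


def looks_safetensors_compatible(filenames, filter_first=True):
    """Return True if every component folder found has a .safetensors file."""
    if filter_first:
        # Keep only weight files so config-only folders never become components.
        filenames = [f for f in filenames if f.endswith(WEIGHT_EXTENSIONS)]
    components = defaultdict(list)
    for name in filenames:
        folder = name.rpartition("/")[0]  # "" means the repo root
        components[folder].append(name)
    # With no files at all this is vacuously True; the real function may differ.
    return all(
        any(f.endswith(".safetensors") for f in files)
        for files in components.values()
    )


flat_repo = ["model.safetensors", "config.json", "scheduler/scheduler_config.json"]
print(looks_safetensors_compatible(flat_repo, filter_first=False))
print(looks_safetensors_compatible(flat_repo, filter_first=True))
```

In this sketch, without the filter the config-only scheduler/ folder counts as a component with no safetensors file and the check fails; with the filter, only the root weights are considered.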
@github-actions github-actions Bot added pipelines size/S PR with diff < 50 LOC labels Jun 30, 2026
@yiyixuxu yiyixuxu requested review from DN6 and dg845 June 30, 2026 09:29
@HuggingFaceDocBuilderDev


The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

# Files explicitly allow-listed by the repo author via `_allow_files` in `model_index.json` are
# added to the download set. This supports repos that keep a component's config/tokenizer files at
# the root (instead of in its own subfolder), where the folder-based allow patterns would miss them.
allow_patterns += allow_filenames
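
To make the effect of this line concrete, the sketch below emulates pattern-based file selection with `fnmatch`. It is an illustration under assumptions: `select_files`, the folder-derived patterns, and the file names are hypothetical, and the real download logic may build its patterns differently.

```python
import fnmatch


def select_files(repo_files, folder_names, allow_filenames):
    """Pick the files matched by folder patterns plus explicitly allowed names."""
    # Folder-based patterns only match files inside component subfolders.
    allow_patterns = [f"{name}/*" for name in folder_names]
    # Exact filenames from `_allow_files` are appended as extra patterns.
    allow_patterns += allow_filenames
    return [
        f for f in repo_files
        if any(fnmatch.fnmatch(f, pattern) for pattern in allow_patterns)
    ]


repo_files = [
    "model.safetensors",
    "tokenizer.json",
    "scheduler/scheduler_config.json",
    "README.md",
]
selected = select_files(repo_files, ["scheduler"], ["tokenizer.json"])
print(selected)
```

Here the root-level tokenizer.json is picked up only because it is listed explicitly; files such as README.md that match no pattern stay out of the download set.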


@yiyixuxu yiyixuxu Jun 30, 2026

Collaborator Author


An alternative would be to add these processor-related files to the default allow_patterns here, so the model author would not have to specify them in model_index.json. I leaned toward the design in this PR mainly because:

(1) It is more practical and more remote-friendly: for a repo hosted this way with slightly different files, the model author can just update model_index.json instead of opening a PR against diffusers.
(2) More fundamentally, I don't want to broadly support loading components from the root. It is too ambiguous for diffusers, because we have multiple components and we often co-host our checkpoints in the same repo as the research repo's original checkpoint, which is hosted at the root. We already have some implicit root-loading support today, but I don't think we need to expand it as the default behavior. _allow_files keeps it explicit and opt-in.

But I'm open to different opinions!

Cover root-level safetensors weights and the flat layout where weights live at
the root alongside a weight-less subfolder (e.g. scheduler/), which must not
prevent the root safetensors from being recognized.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the tests label Jun 30, 2026
Mirrors test_smart_download against a unet-pipeline-dummy clone that lists
big_array.npy in _allow_files, asserting it is downloaded (whereas
test_smart_download asserts it is skipped without _allow_files).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
]
self.assertTrue(is_safetensors_compatible(filenames, variant="fp16"))

def test_diffusers_is_compatible_weightless_subfolder(self):

Collaborator Author


Only this test has a different result from main.
Currently, on main, this case is marked as not is_safetensors_compatible.

@yiyixuxu yiyixuxu requested a review from sayakpaul June 30, 2026 19:38

passed_components = passed_components or []
# only weight files matter for safetensors compatibility
filenames = filter_model_files(filenames)

Collaborator Author


I think it's a bug here, but let me know. cc @DN6

@dg845 dg845 left a comment

Collaborator


Thanks for the PR!
