
Fix flash-attn2 hub-kernel attribute names in attention dispatch (#14012) #14016

Closed
Alex-Wengg wants to merge 1 commit into huggingface:main from Alex-Wengg:fix/14012-flash-attn2-hub-kernel-attrs


@Alex-Wengg


What does this PR do?

Fixes #14012.

The hub-kernel registry (_HUB_KERNELS_REGISTRY in attention_dispatch.py) points the FLASH_HUB and FLASH_VARLEN_HUB backends (both kernels-community/flash-attn2) at attribute paths that don't exist in that kernel:

  • flash_attn_interface._wrapped_flash_attn_forward / _wrapped_flash_attn_backward
  • flash_attn_interface._wrapped_flash_attn_varlen_forward / _wrapped_flash_attn_varlen_backward

kernels-community/flash-attn2 exposes these without the _wrapped_ prefix (_flash_attn_forward, _flash_attn_varlen_forward, …), so _resolve_kernel_attr raises AttributeError: ... does not define attribute path 'flash_attn_interface._wrapped_flash_attn_varlen_forward' when the flash-attn2 hub backend is used.
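
For readers unfamiliar with this failure mode, here is a minimal sketch of how resolving a dotted attribute path fails when one segment is missing. It is not the actual `_resolve_kernel_attr` implementation: `resolve_attr_path` and `fake_kernel` are hypothetical stand-ins, and the fake kernel only mimics the fact that the non-`_wrapped_` names exist.

```python
from types import SimpleNamespace


def resolve_attr_path(root, path):
    """Walk a dotted attribute path on root, one segment at a time.

    Hypothetical helper for illustration; not the library's resolver.
    """
    current = root
    for segment in path.split("."):
        if not hasattr(current, segment):
            raise AttributeError(f"kernel does not define attribute path '{path}'")
        current = getattr(current, segment)
    return current


# Stand-in for the loaded kernel: only the non-_wrapped_ names are defined.
fake_kernel = SimpleNamespace(
    flash_attn_interface=SimpleNamespace(
        _flash_attn_forward=lambda: "forward",
        _flash_attn_varlen_forward=lambda: "varlen_forward",
    )
)

# The non-_wrapped_ path resolves to a callable.
resolved = resolve_attr_path(fake_kernel, "flash_attn_interface._flash_attn_forward")

# The _wrapped_ path fails on its last segment.
try:
    resolve_attr_path(
        fake_kernel, "flash_attn_interface._wrapped_flash_attn_varlen_forward"
    )
    missing_error = None
except AttributeError as exc:
    missing_error = str(exc)
```

The first segment (`flash_attn_interface`) resolves in both cases; only the final function name differs, which is why the error surfaces only when the backend is actually selected.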

This is corroborated by the FLASH_3_HUB config directly above, which already uses the non-_wrapped_ names (flash_attn_interface._flash_attn_forward / _flash_attn_backward).

Fix

Use the actual (non-_wrapped_) attribute names for the flash-attn2 hub configs. Four strings; no behavior change beyond resolving the correct symbols.
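
As a rough illustration of the rename (not the diff itself), the four old paths map to the new ones by dropping the `_wrapped` part of the function name. The list and the `strip_wrapped_prefix` helper below are hypothetical and exist only to show the mapping; the real registry entries are edited by hand.

```python
# The four attribute paths the registry pointed at before this change.
OLD_PATHS = [
    "flash_attn_interface._wrapped_flash_attn_forward",
    "flash_attn_interface._wrapped_flash_attn_backward",
    "flash_attn_interface._wrapped_flash_attn_varlen_forward",
    "flash_attn_interface._wrapped_flash_attn_varlen_backward",
]


def strip_wrapped_prefix(path):
    """Turn 'module._wrapped_name' into 'module._name' (illustrative helper)."""
    module, _, name = path.rpartition(".")
    if name.startswith("_wrapped_"):
        name = "_" + name[len("_wrapped_"):]
    return f"{module}.{name}"


NEW_PATHS = [strip_wrapped_prefix(p) for p in OLD_PATHS]
```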

Verification

Introspecting the kernel (no flash-attn compile, no model download):

from kernels import get_kernel
fi = get_kernel("kernels-community/flash-attn2", version=1).flash_attn_interface

# names the registry currently looks for — all MISSING:
[hasattr(fi, n) for n in ("_wrapped_flash_attn_forward", "_wrapped_flash_attn_backward",
                          "_wrapped_flash_attn_varlen_forward", "_wrapped_flash_attn_varlen_backward")]
# -> [False, False, False, False]

# names this PR uses — all PRESENT:
[hasattr(fi, n) for n in ("_flash_attn_forward", "_flash_attn_backward",
                          "_flash_attn_varlen_forward", "_flash_attn_varlen_backward")]
# -> [True, True, True, True]

After the fix, _maybe_download_kernel_for_backend(...) resolves both forward and backward for FLASH_HUB and FLASH_VARLEN_HUB (no AttributeError). Verified on kernels==0.12.3.

AI assistance (Claude) was used to investigate and draft this change; the diff and verification were reviewed and run locally.

@sayakpaul

The FLASH_HUB and FLASH_VARLEN_HUB configs in _HUB_KERNELS_REGISTRY point at
flash_attn_interface._wrapped_flash_attn_{,varlen_}{forward,backward}, which do
not exist in kernels-community/flash-attn2 (it exposes the non-_wrapped_ names).
_resolve_kernel_attr therefore raises AttributeError when the flash-attn2 hub
backend is used. Use the actual attribute names, matching the FLASH_3_HUB config.

Fixes huggingface#14012

Co-authored-by: Claude <noreply@anthropic.com>
@github-actions bot added the fixes-issue, models, and size/S (PR with diff < 50 LOC) labels and removed the fixes-issue label on Jun 21, 2026
@Alex-Wengg closed this on Jun 21, 2026
@sayakpaul

Member

Will be fixed in huggingface/kernels-community#976. Cc: @DN6 @drbh



Development

Successfully merging this pull request may close these issues.

attention dispatcher assumes wrong attributes for flash attn kernel from hub
