fix(attention_dispatch): use correct attr names for FLASH_VARLEN_HUB …#14015

Open
Vitaliy-Pikalo wants to merge 1 commit into huggingface:main from Vitaliy-Pikalo:patch-1

Conversation

@Vitaliy-Pikalo


The kernels-community/flash-attn2 hub kernel exposes:

  • flash_attn_interface._flash_attn_varlen_forward
  • flash_attn_interface._flash_attn_varlen_backward

But _HUB_KERNELS_REGISTRY[FLASH_VARLEN_HUB] was referencing the non-existent:

  • flash_attn_interface._wrapped_flash_attn_varlen_forward
  • flash_attn_interface._wrapped_flash_attn_varlen_backward

Those _wrapped_* attributes don't exist in the hub varlen kernel, so _flash_attention_varlen_hub would fail with an AttributeError when trying to resolve them at runtime.

For reference, FLASH_HUB (non-varlen flash-attn2) correctly uses _wrapped_flash_attn_forward/backward, since the standard flash-attn2 library does expose those names. The hub varlen kernel, however, uses the unwrapped form (no _wrapped_ prefix), consistent with how _FLASH_3_HUB (flash-attn3) uses _flash_attn_forward/backward.
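To make the naming differences concrete, here is a small sketch that tabulates the attribute names described above and checks them against an interface object. The `EXPECTED_ATTRS` mapping and the `missing_attrs` helper are hypothetical and do not reflect the actual layout of `_HUB_KERNELS_REGISTRY`; the stand-in interface is built by hand rather than loaded from the hub.

```python
from types import SimpleNamespace

# Attribute names per backend, as stated in this PR description.
# (Illustrative mapping only; the real registry structure may differ.)
EXPECTED_ATTRS = {
    "FLASH_HUB": ("_wrapped_flash_attn_forward", "_wrapped_flash_attn_backward"),
    "FLASH_VARLEN_HUB": ("_flash_attn_varlen_forward", "_flash_attn_varlen_backward"),
    "_FLASH_3_HUB": ("_flash_attn_forward", "_flash_attn_backward"),
}


def missing_attrs(interface, names):
    """Return the names from `names` that `interface` does not define."""
    return [name for name in names if not hasattr(interface, name)]


# Hand-built stand-in for the hub varlen kernel's flash_attn_interface.
varlen_interface = SimpleNamespace(
    _flash_attn_varlen_forward=lambda *a, **k: None,
    _flash_attn_varlen_backward=lambda *a, **k: None,
)

# Corrected names: nothing missing.
after_fix = missing_attrs(varlen_interface, EXPECTED_ATTRS["FLASH_VARLEN_HUB"])

# Old names with the `_wrapped_` prefix: both missing.
before_fix = missing_attrs(
    varlen_interface,
    ("_wrapped_flash_attn_varlen_forward", "_wrapped_flash_attn_varlen_backward"),
)
```

A check like this, run once when a kernel is loaded, would surface a mismatched name before the first attention call rather than during it.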

This PR also updates the RuntimeError message in _flash_attention_varlen_hub (both forward and backward ops) to match the corrected attribute names.

Fixes #14012.

…kernels-community/flash-attn2

The `kernels-community/flash-attn2` hub kernel exposes:
- `flash_attn_interface._flash_attn_varlen_forward`
- `flash_attn_interface._flash_attn_varlen_backward`

But `_HUB_KERNELS_REGISTRY[FLASH_VARLEN_HUB]` was referencing:
- `flash_attn_interface._wrapped_flash_attn_varlen_forward`
- `flash_attn_interface._wrapped_flash_attn_varlen_backward`

Those `_wrapped_*` attributes do not exist in the hub kernel, causing `_flash_attention_varlen_hub` to fail with an AttributeError when trying to resolve them.

Contrast with `FLASH_HUB` (non-varlen flash-attn2), which correctly uses `_wrapped_flash_attn_forward/backward` — the standard flash-attn2 library does expose those names, but the hub varlen kernel uses the unwrapped form.

Also updates the RuntimeError message in `_flash_attention_varlen_hub` to match the corrected attribute names.

Fixes huggingface#14012.
github-actions bot added the size/S (PR with diff < 50 LOC), models, and fixes-issue labels on Jun 21, 2026

Labels

fixes-issue, models, size/S (PR with diff < 50 LOC)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

attention dispatcher assumes wrong attributes for flash attn kernel from hub

1 participant