Attention

This file hosts high-performance reference implementations of FlashAttention (forward and backward) and of the attention blocks used in Stable Diffusion models.
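For orientation, every kernel below computes a variant of standard scaled dot-product attention. A minimal PyTorch sketch of that baseline (not this file's kernel code; tensor names are illustrative):

```python
import torch

def sdpa_reference(q, k, v):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scale = q.shape[-1] ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale  # (B, H, N, N)
    probs = torch.softmax(scores, dim=-1)
    return torch.matmul(probs, v)                          # (B, H, N, head_dim)
```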

flash_fwd

FlashAttention forward kernel.
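As a rough sketch of the idea behind the forward kernel (not the kernel itself): FlashAttention streams over K/V tiles while maintaining a running row maximum and softmax denominator, so the full seq_len x seq_len score matrix is never materialized. A PyTorch illustration for a single (batch, head) slice, with illustrative names and tile size:

```python
import torch

def flash_fwd_sketch(q, k, v, block_n=64):
    # q, k, v: (seq_len, head_dim) for one (batch, head) slice.
    n, d = q.shape
    scale = d ** -0.5
    o = torch.zeros_like(q)
    m = torch.full((n, 1), float("-inf"), dtype=q.dtype)  # running row max
    l = torch.zeros((n, 1), dtype=q.dtype)                # running softmax denominator
    for start in range(0, n, block_n):                    # stream over K/V tiles
        kb = k[start:start + block_n]
        vb = v[start:start + block_n]
        s = (q @ kb.T) * scale                            # partial scores for this tile
        m_new = torch.maximum(m, s.max(dim=-1, keepdim=True).values)
        p = torch.exp(s - m_new)                          # tile probabilities, rescaled
        alpha = torch.exp(m - m_new)                      # correction for the old statistics
        l = l * alpha + p.sum(dim=-1, keepdim=True)
        o = o * alpha + p @ vb                            # accumulate unnormalized output
        m = m_new
    return o / l                                          # normalize once at the end
```

Applied to the same inputs, this matches the naive reference above; the payoff is that only one block_n-wide tile of scores lives in memory at a time.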

flash_attn_bwd

FlashAttention backward kernel.
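For reference, a dense (untiled) sketch of the gradients a FlashAttention backward pass produces, assuming the same layout as above; the actual kernel recomputes the probability matrix tile by tile from Q and K rather than storing it. The rowsum(dO * O) term is the usual softmax-Jacobian correction:

```python
import torch

def flash_bwd_sketch(q, k, v, o, do):
    # One (batch, head) slice; q, k, v, o, do are all (seq_len, head_dim),
    # where do is the incoming gradient of the output o.
    scale = q.shape[-1] ** -0.5
    s = (q @ k.T) * scale
    p = torch.softmax(s, dim=-1)                   # recomputed, never stored by the kernel
    dv = p.T @ do
    dp = do @ v.T
    delta = (do * o).sum(dim=-1, keepdim=True)     # rowsum(dO * O) softmax correction
    ds = p * (dp - delta)                          # gradient through the softmax
    dq = (ds @ k) * scale
    dk = (ds.T @ q) * scale
    return dq, dk, dv
```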

fused_self_attn_for_SD_small_head_size

Fused self-attention kernel for small-head-size Stable Diffusion workloads.
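A minimal unfused PyTorch reference of the computation such a kernel replaces, with hypothetical weight names. The assumption behind "small head size" here is that the head dimension is small enough for the fused kernel to keep an entire head's K and V on-chip, so projections, QK^T, softmax, and PV can be fused into a single pass per query tile:

```python
import torch

def sd_self_attn_reference(x, wq, wk, wv, heads):
    # x: (batch, tokens, channels) -- flattened spatial positions of a UNet block.
    # wq, wk, wv: (channels, channels) projection weights; hypothetical names.
    b, n, c = x.shape
    d = c // heads
    def split(t):                                  # (b, n, c) -> (b, heads, n, d)
        return t.view(b, n, heads, d).transpose(1, 2)
    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    p = torch.softmax(q @ k.transpose(-2, -1) * d ** -0.5, dim=-1)
    return (p @ v).transpose(1, 2).reshape(b, n, c)
```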