Attention
This page hosts high-performance reference implementations of FlashAttention (forward and backward), as well as fused attention kernels used in Stable Diffusion models.
- Flash Attention forward kernel
- Flash Attention backward kernel
- Fused self-attention kernel for small-head-size Stable Diffusion workloads
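The key idea behind the forward kernel above is online-softmax tiling: keys and values are processed in blocks, with a running row-wise maximum and normalizer, so the full attention score matrix is never materialized. A minimal NumPy sketch of that idea follows; the function name, signature, and block size are illustrative and do not reflect the actual kernel API.

```python
import numpy as np

def flash_attention_forward(Q, K, V, block_size=64):
    """Tiled attention forward pass with online softmax (FlashAttention-style sketch).

    K/V are consumed in blocks; a running max `m` and running denominator `l`
    keep the softmax numerically stable without storing the full score matrix.
    """
    N, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros_like(Q, dtype=np.float64)   # unnormalized output accumulator
    m = np.full(N, -np.inf)                  # running row-wise max of scores
    l = np.zeros(N)                          # running softmax denominator

    for start in range(0, N, block_size):
        Kb = K[start:start + block_size]
        Vb = V[start:start + block_size]
        S = (Q @ Kb.T) * scale               # scores for this block only
        m_new = np.maximum(m, S.max(axis=1)) # updated running max
        P = np.exp(S - m_new[:, None])       # block probabilities (unnormalized)
        correction = np.exp(m - m_new)       # rescale previously accumulated partials
        l = l * correction + P.sum(axis=1)
        O = O * correction[:, None] + P @ Vb
        m = m_new
    return O / l[:, None]                    # normalize once at the end
```

The result matches ordinary softmax attention exactly; the tiling only changes memory traffic, not the math, which is why the real kernel can avoid reading and writing the N×N score matrix to global memory.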