Attention

This file hosts high-performance reference implementations of FlashAttention (forward and backward) and of the attention blocks used in Stable Diffusion models.
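For orientation, every kernel below computes a variant of standard scaled dot-product attention. A minimal PyTorch sketch of that baseline (not this file's kernel code; tensor names are illustrative):

```python
import torch

def sdpa_reference(q, k, v):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scale = q.shape[-1] ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale  # (B, H, N, N)
    probs = torch.softmax(scores, dim=-1)
    return torch.matmul(probs, v)                          # (B, H, N, head_dim)
```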

flash_fwd

FlashAttention forward kernel.
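As a rough sketch of the idea behind the forward kernel (not the kernel itself): FlashAttention streams over K/V tiles while maintaining a running row maximum and softmax denominator, so the full seq_len x seq_len score matrix is never materialized. A PyTorch illustration for a single (batch, head) slice, with illustrative names and tile size:

```python
import torch

def flash_fwd_sketch(q, k, v, block_n=64):
    # q, k, v: (seq_len, head_dim) for one (batch, head) slice.
    n, d = q.shape
    scale = d ** -0.5
    o = torch.zeros_like(q)
    m = torch.full((n, 1), float("-inf"), dtype=q.dtype)  # running row max
    l = torch.zeros((n, 1), dtype=q.dtype)                # running softmax denominator
    for start in range(0, n, block_n):                    # stream over K/V tiles
        kb = k[start:start + block_n]
        vb = v[start:start + block_n]
        s = (q @ kb.T) * scale                            # partial scores for this tile
        m_new = torch.maximum(m, s.max(dim=-1, keepdim=True).values)
        p = torch.exp(s - m_new)                          # tile probabilities, rescaled
        alpha = torch.exp(m - m_new)                      # correction for the old statistics
        l = l * alpha + p.sum(dim=-1, keepdim=True)
        o = o * alpha + p @ vb                            # accumulate unnormalized output
        m = m_new
    return o / l                                          # normalize once at the end
```

Applied to the same inputs, this matches the naive reference above; the payoff is that only one block_n-wide tile of scores lives in memory at a time.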

flash_attn_bwd

FlashAttention backward kernel.
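For reference, a dense (untiled) sketch of the gradients a FlashAttention backward pass produces, assuming the same layout as above; the actual kernel recomputes the probability matrix tile by tile from Q and K rather than storing it. The rowsum(dO * O) term is the usual softmax-Jacobian correction:

```python
import torch

def flash_bwd_sketch(q, k, v, o, do):
    # One (batch, head) slice; q, k, v, o, do are all (seq_len, head_dim),
    # where do is the incoming gradient of the output o.
    scale = q.shape[-1] ** -0.5
    s = (q @ k.T) * scale
    p = torch.softmax(s, dim=-1)                   # recomputed, never stored by the kernel
    dv = p.T @ do
    dp = do @ v.T
    delta = (do * o).sum(dim=-1, keepdim=True)     # rowsum(dO * O) softmax correction
    ds = p * (dp - delta)                          # gradient through the softmax
    dq = (ds @ k) * scale
    dk = (ds.T @ q) * scale
    return dq, dk, dv
```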

fused_self_attn_for_SD_small_head_size

Fused self-attention kernel for small-head-size Stable Diffusion workloads.
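A minimal unfused PyTorch reference of the computation such a kernel replaces, with hypothetical weight names. The assumption behind "small head size" here is that the head dimension is small enough for the fused kernel to keep an entire head's K and V on-chip, so projections, QK^T, softmax, and PV can be fused into a single pass per query tile:

```python
import torch

def sd_self_attn_reference(x, wq, wk, wv, heads):
    # x: (batch, tokens, channels) -- flattened spatial positions of a UNet block.
    # wq, wk, wv: (channels, channels) projection weights; hypothetical names.
    b, n, c = x.shape
    d = c // heads
    def split(t):                                  # (b, n, c) -> (b, heads, n, d)
        return t.view(b, n, heads, d).transpose(1, 2)
    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    p = torch.softmax(q @ k.transpose(-2, -1) * d ** -0.5, dim=-1)
    return (p @ v).transpose(1, 2).reshape(b, n, c)
```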