Allocated Attention

This file hosts the high-performance reference implementation of the attention blocks used in Stable Diffusion models. The implementation uses the direct allocation API to achieve better performance.

allocated_fused_self_attn_for_SD_small_head_size

Allocated fused self-attention kernel for Stable Diffusion workloads with a small head size.
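For orientation, the computation that a fused self-attention kernel collapses into a single pass can be sketched in plain NumPy. This is a minimal, unoptimized sketch and not the kernel itself: the function name `reference_self_attention`, the `softmax_scale` parameter, and the `(seq_len, d_head)` tensor layout are assumptions made for illustration, and nothing here uses the direct allocation API.

```python
import numpy as np


def reference_self_attention(q, k, v, softmax_scale=1.0):
    """Unfused self-attention: softmax(scale * Q @ K^T) @ V.

    Hypothetical helper for illustration only. q, k, v are assumed to be
    2-D arrays of shape (seq_len, d_head).
    """
    # Attention scores between every query row and every key row.
    scores = (q @ k.T) * softmax_scale
    # Subtract the row-wise max before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    # Normalize each row so the attention weights sum to 1.
    weights /= weights.sum(axis=-1, keepdims=True)
    # Weighted sum of the value rows.
    return weights @ v


rng = np.random.default_rng(0)
seq_len, d_head = 16, 8  # small head size, chosen arbitrarily for the sketch
q = rng.standard_normal((seq_len, d_head))
k = rng.standard_normal((seq_len, d_head))
v = rng.standard_normal((seq_len, d_head))

out = reference_self_attention(q, k, v, softmax_scale=d_head ** -0.5)
print(out.shape)  # (16, 8)
```

A reference like this can serve as a numerical baseline when checking the output of an optimized kernel, since a fused implementation is expected to produce the same result without materializing the intermediate score matrix in separate steps.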
