NKI Samples
nki_samples.reference
All kernels located in this folder have numeric accuracy tests and performance benchmarks defined in the test directory. We also demonstrate using these kernels end-to-end in our integration tests.
You are welcome to customize them to fit your unique workloads and to contribute to the repository by opening a PR.
Note that these kernels are already deployed as part of the Neuron stack. Taking flash attention as an example, compiling Llama models with transformers-neuronx automatically invokes the flash_fwd kernel listed here. Therefore, replacing the framework operators with these NKI kernels likely won’t result in extra performance benefit.
Please see the README page of the GitHub Repository nki-samples for more details.
For NKI documentation, please refer to the main Neuron SDK documentation page.
Relationship to neuronxcc.nki.kernels
The kernels under the reference folder are also available in the neuronxcc.nki.kernels namespace. The kernels in neuronxcc are synced with this repository on every Neuron SDK release.
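Because the same kernels can be reached through two namespaces, it can help to check which one is importable in a given environment before deciding where to import from. The following is a minimal sketch, not part of the repository: the helper names `is_importable` and `first_available` are hypothetical, and the only names taken from this page are the two namespaces. It uses only the Python standard library and does not call any NKI API.

```python
import importlib.util

# Namespaces named on this page. Whether either one is importable depends on
# the local environment (Neuron SDK installed, nki-samples on the path, etc.).
CANDIDATE_NAMESPACES = ("neuronxcc.nki.kernels", "nki_samples.reference")


def is_importable(module_name):
    """Return True if a module spec can be found for module_name.

    find_spec imports parent packages while resolving a dotted name,
    but it does not import the target module itself.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised (as ModuleNotFoundError) when a parent package is missing.
        return False


def first_available(namespaces=CANDIDATE_NAMESPACES):
    """Return the first importable namespace, or None if none is found."""
    for name in namespaces:
        if is_importable(name):
            return name
    return None


if __name__ == "__main__":
    found = first_available()
    print(found if found is not None else "no kernel namespace found")
```

Listing neuronxcc.nki.kernels first is an assumption made for this sketch: it reflects that the packaged copy ships with the Neuron SDK, while a checkout of this repository is the natural place for customized kernels.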
nki_samples.tutorial
Please refer to this page for the tutorials. The code associated with the tutorials can be found at nki-samples/src/tutorials.