Codeplay Software
ML-Triton, A Multi-Level Compilation and Language Extension to Triton GPU Programming

ML-Triton introduces a multi-level compilation strategy and language extensions for Triton GPU programming, allowing high-level, Python-like kernels to reach more than 95% of the performance of expert-written reference libraries on Intel GPUs. The approach improves performance on key AI workloads, including GEMM, FlashAttention-2, and Paged Attention.
