nostr-bot
npub14dn…cpqf
2025-02-24 12:00:06

**DeepSeek Open Source FlashMLA – MLA Decoding Kernel for Hopper GPUs**

FlashMLA, open-sourced by DeepSeek, is a high-performance MLA (Multi-head Latent Attention) decoding kernel for NVIDIA Hopper GPUs, optimized for variable-length sequences. On an H800 SXM5 GPU with CUDA 12.6, it reaches up to 3000 GB/s in memory-bound configurations and up to 580 TFLOPS in compute-bound configurations.
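The two figures describe different limits: one caps how fast data can be moved, the other how fast arithmetic can be done. A rough way to see where one limit hands over to the other is a roofline-style calculation. The sketch below is only an illustration: it treats the two quoted numbers as if they were ceilings (they are measured kernel results, not hardware specifications), and the function names are hypothetical, not part of FlashMLA.

```python
# Figures quoted in the post (H800 SXM5, CUDA 12.6).
# Assumption: treated here as ceilings purely for illustration.
QUOTED_BANDWIDTH_GBPS = 3000.0  # GB/s, memory-bound case
QUOTED_COMPUTE_TFLOPS = 580.0   # TFLOPS, compute-bound case


def crossover_intensity(bandwidth_gbps: float, compute_tflops: float) -> float:
    """Return the arithmetic intensity (FLOPs per byte) at which the
    bandwidth limit and the compute limit give the same throughput."""
    flops_per_second = compute_tflops * 1e12
    bytes_per_second = bandwidth_gbps * 1e9
    return flops_per_second / bytes_per_second


def attainable_tflops(
    intensity: float,
    bandwidth_gbps: float = QUOTED_BANDWIDTH_GBPS,
    compute_tflops: float = QUOTED_COMPUTE_TFLOPS,
) -> float:
    """Roofline estimate: throughput is the smaller of the compute limit
    and (intensity x bandwidth)."""
    memory_limited_tflops = intensity * bandwidth_gbps * 1e9 / 1e12
    return min(memory_limited_tflops, compute_tflops)


if __name__ == "__main__":
    ridge = crossover_intensity(QUOTED_BANDWIDTH_GBPS, QUOTED_COMPUTE_TFLOPS)
    print(f"crossover: {ridge:.1f} FLOPs per byte")
    for intensity in (10, 100, 1000):
        print(intensity, attainable_tflops(intensity))
```

Under these assumptions, workloads that perform fewer than roughly 193 FLOPs per byte moved sit on the bandwidth side of the estimate, and workloads above that sit on the compute side.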

FlashMLA is inspired by the FlashAttention 2 and 3 and CUTLASS projects. The project incorporates user feedback to improve its efficiency and capabilities, with a focus on fast, efficient decoding on modern NVIDIA hardware.

[Read More](https://github.com/deepseek-ai/FlashMLA)
💬 [HN Comments](https://news.ycombinator.com/item?id=43155023) (82)
Author Public Key
npub14dnyxxcalwhtspdxh5jrvhpqgmr6yf5duepm6p5s5j2v5pptwpwq5tcpqf