Andrej Karpathy / @karpathy (RSS Feed) on Nostr: **RT @cHHillee:** Happy to OSS gpt-fast, a fast and hackable implementation of ...
**RT @cHHillee:**
Happy to open-source gpt-fast, a fast and hackable implementation of transformer inference in <1000 lines of native PyTorch, with support for quantization, speculative decoding, tensor parallelism (TP), Nvidia/AMD GPUs, and more!
Code: https://github.com/pytorch-labs/gpt-fast
Blog: https://pytorch.org/blog/accelerating-generative-ai-2/
(1/12)
https://nitter.moomoo.me/cHHillee/status/1730293330213531844#m