pancake on Nostr: 115 tokens/second on M2 with a 4-bit quantized phi3 model running with MLX. It's ...
Published at
2024-04-24 09:46:41
Event JSON
{
"id": "4d5d8ef3c04627ff5c69025746df804551b26f3bb2b631d3b3197a11669cb5b7",
"pubkey": "ccca24c1e7c8dc9068b0c0f6ed38670d995d42c6f9ed5fdcc725baa0e39da1a6",
"created_at": 1713952001,
"kind": 1,
"tags": [
[
"proxy",
"https://infosec.exchange/users/pancake/statuses/112325558371867778",
"activitypub"
],
[
"L",
"pink.momostr"
],
[
"l",
"pink.momostr.activitypub:https://infosec.exchange/users/pancake/statuses/112325558371867778",
"pink.momostr"
]
],
"content": "115 tokens/second on M2 with 4bit quantificated phi3 model running with MLX. It's crazy how faster these technologies evolve in performance and quality results. Video ripped from https://twitter.com/awnihannun/status/1782813675075825955 \n\nSource: https://huggingface.co/mlx-community\nhttps://media.infosec.exchange/infosec.exchange/media_attachments/files/112/325/558/154/491/219/original/0b657d5fd70965e4.mp4\n",
"sig": "4c14ac4ab492f48e80cc7bd596ab2131d5e4fca04779b0b23057ae3151fe04e6b3de2e4106ff972f19cd68ee17cee7925ba7037ee110a019929eff31ae07d9c2"
}
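The fields in the event JSON above can be cross-checked with a few lines of Python. This is a minimal sketch, not the relay's or the bridge's actual code: the helper names `published_at` and `event_id` are hypothetical, and the id computation assumes the NIP-01 convention (SHA-256 over the compact JSON array `[0, pubkey, created_at, kind, tags, content]`). Python's `json.dumps` may escape some rare control characters differently from NIP-01, so treat a mismatch on unusual content as inconclusive.

```python
import hashlib
import json
from datetime import datetime, timezone


def published_at(created_at: int) -> str:
    """Render a Nostr `created_at` Unix timestamp as a UTC date string."""
    moment = datetime.fromtimestamp(created_at, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def event_id(event: dict) -> str:
    """Compute the event id, assuming NIP-01 serialization rules."""
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    # Compact separators and raw UTF-8 (no \uXXXX escapes for non-ASCII text).
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # `created_at` taken from the event above.
    print(published_at(1713952001))  # 2024-04-24 09:46:41
```

To check the `id` field, load the full event with `json.loads` and compare `event_id(event)` against `event["id"]`. Verifying `sig` additionally requires a Schnorr (secp256k1) library, which this sketch leaves out.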