Chi Kim on Nostr: Llama.cpp now supports distributed inference, meaning you can use multiple ...
Published at
2024-05-23 21:33:44
Event JSON
{
"id": "ea3ceb91488a950ed938cce02884f3d63c09fe9bb1cc60225d69048e2a02b0ff",
"pubkey": "9bae200856dfd9681a9424eaf3ad571f2c812635728d2d1f54b5a9439d6b3933",
"created_at": 1716500024,
"kind": 1,
"tags": [
[
"t",
"llm"
],
[
"t",
"ai"
],
[
"t",
"ml"
],
[
"proxy",
"https://mastodon.social/users/chikim/statuses/112492545623302956",
"activitypub"
]
],
"content": "Llama.cpp now supports the distributed inference, meaning you can use multiple computers to speed up the response time! Network is the main bottleneck, so all machines need to be hard wired, not connected through wifi. ##LLm #AI #ML https://github.com/ggerganov/llama.cpp/tree/master/examples/rpc",
"sig": "4f03748d440d11b2328e1984b005927b6fcfbd0e2ae0b9a0b7506a2c2d0e9b3300dca8c465a9a66ff0e4652481ca2e4b5b64708ca659fd13fefe6bad760e4f07"
}
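
For reference, the linked examples/rpc README outlines the setup along these lines. This is a rough sketch, not authoritative: the LLAMA_RPC cmake option, the rpc-server and main binary names, port 50052, and the IP addresses are assumptions from around the time of this post and may differ in later llama.cpp versions.

# On each worker machine: build llama.cpp with the RPC backend
# enabled, then start an rpc-server listening on the wired LAN
cmake -B build-rpc -DLLAMA_RPC=ON
cmake --build build-rpc --config Release
build-rpc/bin/rpc-server -p 50052

# On the main machine: run inference, pointing --rpc at the workers
# (--rpc takes a comma-separated host:port list; addresses are placeholders)
build-rpc/bin/main -m model.gguf -p "Hello" -ngl 99 \
  --rpc 192.168.1.10:50052,192.168.1.11:50052

The -ngl flag offloads layers to the pooled backends, which is what lets the model weights be split across the networked machines; per the post, a wired connection matters because every token generation round-trips activations over the network.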