jimbocoin on Nostr
I think it’s talking about graphics card VRAM. If your graphics card doesn’t have the capabilities and enough VRAM to run and fit the model, it stays in CPU mode. In that mode, I think it doesn’t load the whole model into regular system RAM. Instead, it pages parts of the model in from disk as it needs them. If you watch disk I/O, you’ll probably see it spike while the model is working on a prompt.
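To make the "pages in from disk" idea concrete, here is a minimal sketch of one way it can work: memory-mapping the file so the OS only reads the pages that actually get touched. This assumes the runtime uses something like `mmap`, which the note above does not confirm. The file layout and the helper names (`make_fake_model`, `read_slice_via_mmap`) are made up for illustration.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # OS page size, commonly 4096 bytes


def make_fake_model(num_pages=64):
    """Write a hypothetical stand-in 'model' file: page i is filled with byte value i % 256."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i % 256]) * PAGE)
    return path


def read_slice_via_mmap(path, offset, length):
    """Map the whole file read-only and copy out one slice.

    Mapping does not read the file up front. The OS faults in only the
    pages backing the requested slice, which is where the disk I/O comes from.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


if __name__ == "__main__":
    path = make_fake_model()
    try:
        # Touch a few bytes in the middle of the file; the rest stays on disk.
        chunk = read_slice_via_mmap(path, 10 * PAGE, 16)
        print(len(chunk), chunk[0])
    finally:
        os.remove(path)
```

If the file is larger than free RAM, the OS can evict mapped pages and read them back later, which would match the repeated disk I/O spikes during a prompt.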
Published at 2024-07-07 01:29:22
Event JSON
{
"id": "b759f64c49048690a2c20a5296263713ee29f1e7e14a69dde3a70f788ab12cd4",
"pubkey": "6140478c9ae12f1d0b540e7c57806649327a91b040b07f7ba3dedc357cab0da5",
"created_at": 1720315762,
"kind": 1,
"tags": [
[
"e",
"7d10d6425d552561cbef67c99eb49584c39449b014e3ebe00a845bf423671dba",
"",
"root"
],
[
"p",
"2b2c779db75f6363fbad7567dec2726d36aba05893b714001e0563cabef84f56"
]
],
"content": "I think it’s talking about graphics card vRAM. If your graphics card doesn’t have the capabilities and vRAM to run and fit the model, it stays in CPU mode. In that mode, I think it doesn’t load the whole thing in regular system RAM. Instead, it pages into the model from disk. If you look at disk I/O, you’ll probably see it spiking while the model is working on a prompt.",
"sig": "72d5561d2e9bf9c00c84b0af69a5bb02bbeda61a2238d9eb555bbddb9021a7645715712afb8f6598bbbdcf64ab76163b38d7a68ca6500d6598c4715a71d43296"
}