Tyler the Enginigger on Nostr:
I am going to try this today: https://github.com/Tencent/Hunyuan3D-2
Not the deepseek-r1 because I got bored of it
You can run almost any model on the CPU with your normal RAM, but I think, for some cards, some CPUs, and some Linux versions on some drivers, you can either do CPU offload alongside a GPU or run GPU-only.
Because it's such a fast-moving field, I can't tell you that ##B models are good, or better, or behave a certain way, or fit in this much VRAM, etc., because it also depends on whether they're using fp32/fp16/fp8 or int8/int4 quantization to reduce total size at the cost of accuracy.
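To make the precision point concrete, here's a back-of-the-envelope sketch of weights-only VRAM footprint at different precisions. This is my own rough helper, not anything from a library, and it ignores activations, KV cache, and runtime overhead, which add real memory on top:

```python
def weight_vram_gb(params_billion: float, bits_per_param: int) -> float:
    """Rough weights-only memory estimate in GB (using 1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8

# The same hypothetical 7B model at different precisions:
for bits, name in [(32, "fp32"), (16, "fp16"), (8, "int8/fp8"), (4, "int4")]:
    print(f"{name:>8}: ~{weight_vram_gb(7, bits):.1f} GB")
# fp32 ~28.0 GB, fp16 ~14.0 GB, int8 ~7.0 GB, int4 ~3.5 GB
```

That's why the same "7B" model can be impossible on one card and comfortable on another: the quantization matters as much as the parameter count.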
For example, for the one I linked above, I don't have enough VRAM. I do for ONE model, but the gradio_app.py wants more. Sometimes you can set cpu_offload=True somewhere and it will be MUCH slower, but you can at least run it.
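To illustrate what CPU offload buys you conceptually — only one submodule resident on the GPU at a time, the rest parked in system RAM — here's a toy simulation. The Layer class and sizes are made up for illustration; real frameworks do the actual tensor moves for you (e.g. Diffusers pipelines expose `enable_model_cpu_offload()`):

```python
class Layer:
    """Toy stand-in for a model submodule with a fixed memory footprint."""
    def __init__(self, size_gb: float):
        self.size_gb = size_gb
        self.device = "cpu"  # everything starts out in system RAM

def peak_gpu_memory(layers, offload: bool) -> float:
    """Simulate one forward pass and return the peak 'VRAM' used, in GB."""
    if not offload:
        # Whole model resident on the GPU at once.
        return sum(layer.size_gb for layer in layers)
    peak = 0.0
    for layer in layers:
        layer.device = "gpu"              # copy this layer onto the GPU
        peak = max(peak, layer.size_gb)   # only one layer resident at a time
        layer.device = "cpu"              # evict it before loading the next
    return peak

model = [Layer(2.0), Layer(3.5), Layer(1.5)]  # hypothetical 7 GB model
print(peak_gpu_memory(model, offload=False))  # 7.0 — won't fit a small card
print(peak_gpu_memory(model, offload=True))   # 3.5 — fits, but the copies cost time
```

The slowness comes from those CPU-to-GPU copies happening on every pass, which is exactly the trade-off: you swap VRAM pressure for PCIe transfer time.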
It's a pretty complicated field, as far as new concepts and what affects what, and I'm still trying to figure it all out.
Ironically, this memory issue could be fixed if NVIDIA would just support sharing system RAM with VRAM by extending the address space, but that would defeat the purpose of them wanting you to buy newer, more expensive cards with more VRAM, wouldn't it?
