Tyler the Enginigger /
npub1a8p…2ncx
2025-01-27 23:35:58
in reply to nevent1q…9ccp


I am going to try this today: https://github.com/Tencent/Hunyuan3D-2

Not the deepseek-r1 because I got bored of it

You can run almost any model on the cpu with your normal ram. But depending on the card, the cpu, the linux version, and the drivers, you can either do cpu offload alongside a gpu, or run gpu-only.
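The decision above can be sketched as a toy planner. This is purely illustrative — the function name, thresholds, and labels are my own made-up assumptions, not any real library's API; real frameworks account for activations, caches, and driver quirks too:

```python
def pick_backend(model_gb, vram_gb, ram_gb):
    """Toy planner: decide where a model of `model_gb` gigabytes can run.

    Hypothetical sketch only -- real tooling makes this call with far
    more nuance (activation memory, drivers, per-layer splits, etc.).
    """
    if model_gb <= vram_gb:
        return "gpu"              # whole model fits in vram
    if vram_gb > 0 and model_gb <= vram_gb + ram_gb:
        return "cpu_offload"      # split between vram and system ram
    if model_gb <= ram_gb:
        return "cpu"              # slow, but normal ram holds it
    return "does_not_fit"

print(pick_backend(model_gb=26, vram_gb=12, ram_gb=64))  # cpu_offload
```

The same model can land in any of those buckets depending on your hardware, which is part of why blanket "X fits in Y GB" claims age badly.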

Because it's such a fast-moving field, I can't tell you that ##B models are good, or better, or behave this way, or fit in this much vram, etc. It also depends on whether they're using fp32/16/8 or int8/4 quantization to reduce total size at the sacrifice of accuracy.
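The back-of-envelope math for why quantization matters is just parameter count times bytes per parameter (this sketch ignores activations, caches, and framework overhead, so real usage is always higher):

```python
# Approximate bytes per parameter for common weight formats.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1, "int8": 1, "int4": 0.5}

def approx_weight_gb(params_billions, fmt):
    """Rough weight size in GB: 1e9 params * N bytes ~= N GB per billion.

    Weights only -- running the model needs extra vram on top of this.
    """
    return params_billions * BYTES_PER_PARAM[fmt]

for fmt in ("fp32", "fp16", "int8", "int4"):
    print(f"7B @ {fmt}: ~{approx_weight_gb(7, fmt):g} GB")
# 7B @ fp32: ~28 GB
# 7B @ fp16: ~14 GB
# 7B @ int8: ~7 GB
# 7B @ int4: ~3.5 GB
```

So the same "7B" model can be anywhere from ~3.5 GB to ~28 GB before you've even loaded it, which is why the parameter count alone tells you so little.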

For example, for the one I linked above, I don't have enough vram. I have enough for ONE model, but the gradio_app.py wants more than that. Sometimes you can set cpu_offload=True somewhere and it will be MUCH slower, but you can at least run it.
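The general idea behind a cpu_offload flag is to keep weights parked in system ram and move each submodule to the gpu only while it's actually running. A toy sketch of that pattern (the class, devices, and numbers here are stand-ins I made up, not Hunyuan3D-2's or any real framework's code):

```python
class ToyModule:
    """Stand-in for a neural-net block; tracks which 'device' holds it."""
    def __init__(self, name):
        self.name, self.device = name, "cpu"

    def to(self, device):
        self.device = device
        return self

    def forward(self, x):
        assert self.device == "gpu", "block must be on the gpu to run"
        return x + 1  # pretend computation

def run_with_cpu_offload(modules, x):
    """Move one block at a time to the gpu, run it, evict it again.

    Much slower (constant ram<->vram copies every step), but peak vram
    is only one block's worth instead of the whole model's.
    """
    for m in modules:
        m.to("gpu")
        x = m.forward(x)
        m.to("cpu")  # free vram for the next block
    return x

pipeline = [ToyModule(f"block{i}") for i in range(3)]
print(run_with_cpu_offload(pipeline, 0))  # 3
```

That trade is exactly what you feel in practice: it runs, but every forward pass pays for shuffling weights across the PCIe bus.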

It's a pretty complicated field as far as new concepts and what affects what, and I'm still trying to figure it all out.

Ironically, this memory issue could be fixed if nvidia would just support sharing system ram with vram by extending the address space, but that would defeat the purpose of them wanting you to buy newer more expensive cards with more vram, wouldn't it?

Author Public Key
npub1a8p5fz2ja6dsntqvngf7x2ej7gp43q2w8e85lfnxqexsqdslpmdqhe2ncx