Tyler the Enginigger on Nostr: fp/int and the 32/16/8/4 refer to the type and size of the values used in the model. ...
fp/int and the 32/16/8/4 refer to the type and size of the values used in the model: floating point or integer, and how many bits each value takes. The more bits per value, the more accurate it is, but the model gets bigger and takes up more disk space and more VRAM.
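To put rough numbers on that, here is a minimal sketch that reads the 32/16/8/4 as bits per value (so fp32 is 4 bytes and int4 is half a byte). The function name and the 7-billion-parameter figure are made up for illustration, and it only counts the weights, not activations or other overhead.

```python
# Bits per stored value for the common formats mentioned above.
BITS_PER_VALUE = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weights_size_gib(num_params, fmt):
    """Approximate size of the weights alone, in GiB (hypothetical helper)."""
    bits = BITS_PER_VALUE[fmt]
    return num_params * bits / 8 / 2**30  # bits -> bytes -> GiB


if __name__ == "__main__":
    # Illustrative parameter count, not taken from any specific model.
    for fmt in BITS_PER_VALUE:
        print(f"{fmt}: {weights_size_gib(7_000_000_000, fmt):.1f} GiB")
```

Halving the bits per value halves the footprint, which is why the same model can fit in VRAM as int4 but not as fp16.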
When people write the code around these models, the models sometimes get loaded stupidly. For example, in the function that does memory management for this thing:
I might be able to free up VRAM by splitting the two that are loaded into VRAM at the same time into separate steps, hopefully without needing to load/unload 50x, but if that's what I have to do, that's what I have to do.
Or I can try to offload to CPU or, if those fail, just set it to do the work on the CPU above.
Also, the developers misspelled # empty cuda cache
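The "separate steps" idea could look something like the sketch below. It is not the code from the project in question: `pick_device`, `empty_cuda_cache`, and `run_staged` are hypothetical names, and it assumes each model object has a PyTorch-style `.to(device)` method that moves it in place (as `torch.nn.Module.to` does). The only real PyTorch calls are `torch.cuda.is_available()` and `torch.cuda.empty_cache()`, and the sketch quietly falls back to CPU if PyTorch is not installed.

```python
import gc


def pick_device():
    """Return "cuda" if PyTorch sees a GPU, otherwise fall back to "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def empty_cuda_cache():
    """Release cached, unused VRAM (does nothing without PyTorch or CUDA)."""
    gc.collect()
    try:
        import torch
    except ImportError:
        return
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


def run_staged(stages, data, device):
    """Run (model, fn) pairs one at a time so only one model sits on the device.

    Each fn(model, data) returns the data for the next stage and is
    responsible for putting its own inputs on the right device.
    """
    for model, fn in stages:
        model.to(device)  # load this stage's weights onto the device
        try:
            data = fn(model, data)
        finally:
            model.to("cpu")     # offload before the next stage loads
            empty_cuda_cache()  # hand the freed VRAM back
    return data
```

The cost is one load/unload per stage per call, so if the two stages alternate inside a loop this turns into exactly the repeated load/unload mentioned above. Passing `"cpu"` as the device is the last-resort option of doing all the work on the CPU.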
