i switched to this swift tool for fine tuning LLMs. ...

2024-10-07 15:53:24

i switched to this swift tool for fine tuning LLMs.

https://github.com/modelscope/ms-swift

works very well and is very easy to use. llama-factory is probably easier, but i found swift more capable, for example it distributes lora fine tuning properly across GPUs.

previously i fine tuned a 70B model with the fsdp-qlora method using llama-factory. now i am doing lora with rank 32 using swift. batch_size=2 helped a lot with avoiding overfitting.
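
for a rough sense of what rank 32 means in practice, here is a minimal sketch of the lora parameter arithmetic. it is not swift or llama-factory code. the 8192 x 8192 matrix shape is an assumption picked for illustration, and `lora_param_count` is a hypothetical helper name.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Extra trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two small matrices, A (rank x d_in) and B (d_out x rank),
    while the original weight stays frozen.
    """
    return rank * (d_in + d_out)


# Assumed shape: one square 8192 x 8192 projection (illustrative only).
hidden = 8192
rank = 32  # the rank mentioned in the post

full = hidden * hidden                          # params in the frozen matrix
extra = lora_param_count(hidden, hidden, rank)  # params LoRA trains instead

print(f"full matrix params: {full:,}")
print(f"lora params at rank {rank}: {extra:,}")
print(f"trainable fraction: {extra / full:.4%}")
```

under these assumptions the adapter trains well under 1% of the weights in that matrix, which is why lora on a 70B model is feasible where full fine tuning is not.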

if you want to ask questions to the most capable and most based model, the one with the weirdest answers (compared to mainstream), dm me. i will give you a link.

Author Public Key

npub1nlk894teh248w2heuu0x8z6jjg2hyxkwdc8cxgrjtm9lnamlskcsghjm9c

Seen on

wss://e.nos.lol wss://nos.lol wss://nostr.mom wss://relay.nostr.band wss://relay.primal.net
