jb55 on Nostr:
#LoRA makes me bullish on #ai hacking on consumer hardware while still leveraging large pretrained models as a base. It works by injecting small low-rank matrices into each layer of the transformer stack of a larger base model like llama and training only those.
The base model’s weights are frozen, but you can train these low-rank “adapters,” which are much smaller and need far less memory and compute.
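Rough sketch of what an adapter looks like, assuming PyTorch (the layer size, rank, and scaling here are just illustrative, not anything from the post or a specific library):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer and adds a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)          # freeze base weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # two small matrices whose product is the low-rank update
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # r x d_in
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # d_out x r
        self.scale = alpha / r

    def forward(self, x):
        # frozen base path + trainable low-rank adapter path
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# Wrap e.g. an attention projection; only A and B receive gradients.
proj = LoRALinear(nn.Linear(4096, 4096))
trainable = sum(p.numel() for p in proj.parameters() if p.requires_grad)
print(trainable)  # ~65k trainable params vs ~16.8M in the frozen base weight
```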
Nice thing about fine-tuning is that you are basically teaching the AI new things that it won’t forget all the time, so we can give it lots of domain knowledge about nostr, NIPs, etc. The hardest part is setting up a good training dataset.
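One common way to shape that dataset is instruction/response pairs; this is just an illustrative format, not a requirement:

```python
# Illustrative fine-tuning examples (content and schema are up to you).
examples = [
    {"instruction": "What does NIP-05 specify?",
     "response": "A way to map a nostr pubkey to a DNS-based identifier (user@domain)."},
    {"instruction": "How does a client request events from a relay?",
     "response": "It sends a REQ message over the websocket with a subscription id and filters (NIP-01)."},
]
```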