Nazo /
npub1c9j…w6r3
2024-12-20 13:11:39
in reply to nevent1q…v2l2

nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpqsu4t3c2h2fl2ws6hfcnn5r7jdn87qvhvaafe5uv3hcwygxz3rjss4mvvfe (nprofile…vvfe) I presume this means in the training itself? According to RULER, it seems most models advertised with a 128K context are effective at 64K at best. So they've also been taking shortcuts to get that cost down...

I wonder if they even need that much, given the way most people are using them. Most stuff is pretty close to one-shot. Kind of wasteful really in those cases.
Author Public Key
npub1c9jd3t9c50qej4rc0gudnvta8qsn7htm3pg5wwg86t7n45kes59sluw6r3