someone on Nostr: having bad LLMs can allow us to find truth faster. reinforcement algorithm could be: ...
having bad LLMs can allow us to find truth faster. reinforcement algorithm could be: "take what a proper model says and negate what a bad LLM says". then the convergence will be faster with two wings!
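The idea in the note resembles contrastive scoring: rank candidate answers by what the "proper" model prefers minus what the "bad" model prefers, so the bad model's systematic errors push the ensemble away from them. A minimal sketch, assuming we only have per-candidate log-probabilities from each model (the function names, the `alpha` weight, and the toy numbers below are illustrative, not from the note):

```python
# Hedged sketch of the note's "negate what a bad LLM says" idea,
# combining two models' per-candidate log-probabilities.
# All names (contrastive_score, pick, alpha) are assumptions for
# illustration; the toy log-prob values are made up.

def contrastive_score(good_logprobs, bad_logprobs, alpha=0.5):
    """Reward what the 'proper' model likes, penalize what the
    'bad' model likes, weighted by alpha."""
    return {c: good_logprobs[c] - alpha * bad_logprobs[c]
            for c in good_logprobs}

def pick(good_logprobs, bad_logprobs, alpha=0.5):
    """Return the candidate with the highest contrastive score."""
    scores = contrastive_score(good_logprobs, bad_logprobs, alpha)
    return max(scores, key=scores.get)

# Toy example: the good model slightly prefers "A", but the bad
# model strongly prefers "A" too; subtracting the bad model's
# preference flips the pick to "B".
good = {"A": -0.5, "B": -0.7}
bad = {"A": -0.2, "B": -2.0}
print(pick(good, bad, alpha=0.5))  # prints "B"
```

Whether this converges "faster with two wings" depends on the bad model's errors being informative rather than random; with a purely noisy bad model the subtraction just adds variance.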
Published at 2025-01-31 15:04:36

Event JSON
{
  "id": "8753ab08cc9bcf1f08ed37c6cecce7e62925d57beb790e4d342797306090a17b",
  "pubkey": "9fec72d579baaa772af9e71e638b529215721ace6e0f8320725ecbf9f77f85b1",
  "created_at": 1738335876,
  "kind": 1,
  "tags": [],
  "content": "having bad LLMs can allow us to find truth faster. reinforcement algorithm could be: \"take what a proper model says and negate what a bad LLM says\". then the convergence will be faster with two wings!",
  "sig": "f0f0f20764240da3c241f9e124c30aefd591b6eee50c106472bca40ad8a029dbed0f037047ee2f3793f7453ef4ced615288a0dd0011d58441164e5fcd7c21136"
}