someone on Nostr: having bad LLMs can allow us to find truth faster. reinforcement algorithm could be: ...
having bad LLMs can allow us to find truth faster. reinforcement algorithm could be: "take what a proper model says and negate what a bad LLM says". then the convergence will be faster with two wings!
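The idea in the note resembles contrastive scoring: rank candidate answers by what the "proper" model prefers minus what the "bad" model prefers, so the bad model's systematic errors push the ensemble away from them. A minimal sketch, assuming we only have per-candidate log-probabilities from each model (the function names, the `alpha` weight, and the toy numbers below are illustrative, not from the note):

```python
# Hedged sketch of the note's "negate what a bad LLM says" idea,
# combining two models' per-candidate log-probabilities.
# All names (contrastive_score, pick, alpha) are assumptions for
# illustration; the toy log-prob values are made up.

def contrastive_score(good_logprobs, bad_logprobs, alpha=0.5):
    """Reward what the 'proper' model likes, penalize what the
    'bad' model likes, weighted by alpha."""
    return {c: good_logprobs[c] - alpha * bad_logprobs[c]
            for c in good_logprobs}

def pick(good_logprobs, bad_logprobs, alpha=0.5):
    """Return the candidate with the highest contrastive score."""
    scores = contrastive_score(good_logprobs, bad_logprobs, alpha)
    return max(scores, key=scores.get)

# Toy example: the good model slightly prefers "A", but the bad
# model strongly prefers "A" too; subtracting the bad model's
# preference flips the pick to "B".
good = {"A": -0.5, "B": -0.7}
bad = {"A": -0.2, "B": -2.0}
print(pick(good, bad, alpha=0.5))  # prints "B"
```

Whether this converges "faster with two wings" depends on the bad model's errors being informative rather than random; with a purely noisy bad model the subtraction just adds variance.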
Published at 2025-01-31 15:04:36

Event JSON
{
  "id": "8753ab08cc9bcf1f08ed37c6cecce7e62925d57beb790e4d342797306090a17b",
  "pubkey": "9fec72d579baaa772af9e71e638b529215721ace6e0f8320725ecbf9f77f85b1",
  "created_at": 1738335876,
  "kind": 1,
  "tags": [],
  "content": "having bad LLMs can allow us to find truth faster. reinforcement algorithm could be: \"take what a proper model says and negate what a bad LLM says\". then the convergence will be faster with two wings!",
  "sig": "f0f0f20764240da3c241f9e124c30aefd591b6eee50c106472bca40ad8a029dbed0f037047ee2f3793f7453ef4ced615288a0dd0011d58441164e5fcd7c21136"
}