Tom Walker /
npub1q88…60ks
2023-09-12 11:01:00

Every so often I see a post about how LLMs fail logic puzzles.

And... yes? Of course they do. The only way a model could solve one is if it has seen that puzzle, or a substantially similar one, before. (But then it might give the answer to the similar puzzle, not the correct answer.)

Why is this even tested so often, or considered surprising? An LLM is, in essence, an autocomplete. It does not understand logic. It has no concept of a correct answer. It gives the most likely completion.
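
To make "most likely completion" concrete, here is a toy sketch. It is a word-level bigram counter, which is vastly simpler than a real LLM, and the function names and the tiny corpus are made up for illustration. It only shows the failure mode described above: a completer that has seen a similar pattern more often will repeat that pattern's answer.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def complete(counts, prompt, max_words=5):
    """Greedily append the most frequent follower of the last word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # nothing ever followed this word in training
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical training text: "is four" appears twice, "is five" once.
corpus = "two plus two is four . two plus two is four . two plus three is five ."
model = train_bigrams(corpus)

print(complete(model, "two plus three is", max_words=1))
# prints "two plus three is four"
```

The completer has no notion of arithmetic. It picks "four" because "four" followed "is" most often in what it saw, which is the answer to the similar question, not the correct one.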
Author Public Key
npub1q88kqzwung4z5upfg55f3j7uk5xfz3yuuat5g0m5wgy5ek05fucsz260ks