Dan Goodman /
npub1pmx…q5qg
2024-10-15 16:33:12

Interesting feature of the Apple LLM reasoning paper. I always tell my students that exams include no irrelevant information, which gives you a clue as to the answer. LLMs have learnt this too well, and can't ignore irrelevant info (perf drops 17-70%).

https://arxiv.org/abs/2410.05229

In other words: exam-style questions have massive leakage that many students don't pick up on but that LLMs find impossible to ignore. I suspect this tells us more about the way we write exam questions than anything else. They're not a good measure of LLM performance, nor of human performance!

Author Public Key
npub1pmxyaxv6dm8xnd99c80fqyvtjmsmp68gv026tdyfsyz4000ttk0s48q5qg