Dan Goodman on Nostr:
Update: I was told about theoretical results showing that some quite simple tasks, like parity checking, aren't possible for certain classes of transformers, at least in the limit of large input lengths.
https://arxiv.org/abs/2311.00208
This all got me thinking: what is the simplest class of problems that these models get systematically wrong but that people would mostly get right? I like my matching-parentheses examples for that, but I also really like the suggestion "pick two random numbers between 1445 and 1743 and multiply them together", which, at least in my tests, models get very systematically wrong (although many humans would struggle too). From:
https://medium.com/@konstantine_45825/gpt-4-cant-reason-2eab795e2523
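If anyone wants to try these probes themselves, here's a minimal sketch of ground-truth checkers for the three tasks mentioned above (parity, matching parentheses, and the multiplication prompt); the function names and prompt wording are my own, not from either linked source:

```python
import random

def parity(bits: str) -> int:
    """Ground truth for the parity task: 1 if the string has an odd number of 1s."""
    return bits.count("1") % 2

def balanced(s: str) -> bool:
    """Ground truth for the matching-parentheses task."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing paren with no open partner
                return False
    return depth == 0

def multiplication_prompt(rng: random.Random) -> tuple[str, int]:
    """Builds the multiplication probe from the post, together with its answer."""
    a = rng.randint(1445, 1743)
    b = rng.randint(1445, 1743)
    return f"What is {a} * {b}?", a * b
```

You'd feed each prompt to the model and score its reply against the returned ground truth; the interesting signal is whether the errors are systematic rather than occasional.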
Any better ideas?