Joxean Koret (@matalaz) /
npub1wv2…mxt8
2024-11-13 09:11:11

Some people decided to throw non-trivial mathematical problems at LLMs. Surprising no one, LLMs fail miserably to solve them.

"FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI".

https://arxiv.org/pdf/2411.04872

#FrontierMath #AI #LLM #LLMS
Author Public Key
npub1wv2xfu478mmtdrtvl5xzzxcpzrqcg2g53avk6yv9wvxsuzs2ycdq8mmxt8