Joxean Koret (@matalaz) on Nostr:
Some people decided to throw non-trivial mathematical problems at LLMs. Surprising no one, LLMs fail miserably to solve them.
"FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI".
https://arxiv.org/pdf/2411.04872
#FrontierMath #AI #LLM #LLMS