Ars Technica on Nostr:
New secret math benchmark stumps AI models and PhDs alike
FrontierMath's difficult questions remain unpublished so that AI companies can't train against it.
https://arstechnica.com/ai/2024/11/new-secret-math-benchmark-stumps-ai-models-and-phds-alike/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social