Even some of the best AI can’t beat this new benchmark

The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides a number of data labeling and AI development services, have released a challenging new benchmark for frontier AI systems. The benchmark, called Humanity’s Last Exam, includes thousands of crowdsourced questions touching on subjects like mathematics, humanities, and the natural sciences. To make […] © 2024 TechCrunch. All rights reserved. For personal use only.

Jan 24, 2025 - 07:18
 0
Even some of the best AI can’t beat this new benchmark

The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides a number of data labeling and AI development services, have released a challenging new benchmark for frontier AI systems. The benchmark, called Humanity’s Last Exam, includes thousands of crowdsourced questions touching on subjects like mathematics, humanities, and the natural sciences. To make […]

© 2024 TechCrunch. All rights reserved. For personal use only.

What's Your Reaction?

like

dislike

love

funny

angry

sad

wow