
Hugging Face
Machine Learning • 2026-04-29
AI evals are becoming the new compute bottleneck
The cost of running AI evaluations has become a bottleneck that determines who can afford to conduct them: the Holistic Agent Leaderboard spent $40,000 on 21,730 agent rollouts, with individual GAIA runs costing $2,829.