Creating Better AI Benchmarks: How Many Raters is Enough?

By research.google
Publication Date: 2026-03-31 12:00:00

Google Research explores the trade-off between the number of articles and the number of human raters per article, with the aim of improving the reproducibility of AI benchmarks and capturing the nuances of human disagreement.