By Tegan Jones
Publication Date: 2026-02-13 00:57:00
Welcome back to Neural Notes, a weekly column where I discuss how AI is impacting Australia. In this issue: Arena, once a niche research project, has become a de facto public arbiter for ChatGPT, Claude, Gemini, and more. But how much should founders actually trust its rankings?
If you google or search forums for the best AI model, you might eventually land on Arena – a live leaderboard where models from OpenAI, Anthropic, Google, DeepSeek and others compete side by side in anonymous comparisons.
What Arena actually is
It was originally called LMArena, but was renamed simply Arena at the end of January.
The interface is simple. You enter a prompt, get two answers, and vote for the better answer.
Only after voting do you see which model produced each answer. In my test below I asked for the best banh mi in Sydney. I preferred Assistant A’s answer, which turned out to be Claude. Assistant B turned out to be Gemini.
This choice leads to a…