The PhD students who became the judges of the AI industry

Artificial intelligence models are multiplying fast, and competition is stiff. With so many players crowding the space, which one will be the best, and who decides that? Arena, formerly LM Arena, has emerged as the de facto public leaderboard for frontier LLMs, influencing funding, launches, and PR cycles. In just seven months, the startup went from a UC Berkeley PhD research project to being valued at $1.7 billion.

On this episode of TechCrunch’s Equity podcast, Rebecca Bellan catches up with Arena co-founders Anastasios Angelopoulos and Wei-Lin Chiang to determine how a team like theirs can build a neutral benchmark when the companies they’re ranking are also their backers. 

Listen to the full episode to hear: 

  • How Arena actually works, and why its founders say you can’t game it the way you might a static benchmark. 
  • What “structural neutrality” actually means, and whether taking money from OpenAI, Google, and Anthropic is a conflict of interest. 
  • How Arena is moving beyond chat to benchmark agents, coding, and real-world tasks with a new enterprise product. 
  • Why Claude is currently winning the expert leaderboard for legal and medical use cases. 
  • Arena’s bet on what comes after LLMs, and why agents are next on the leaderboard. 

Subscribe to Equity on YouTube, Apple Podcasts, Overcast, Spotify and all the casts. You can also follow Equity on X and Threads, at @EquityPod.