ARC Prize for AGI
See the ARC Leaderboard to track contenders for the prize.
via Ben’s Bites
“LLMs are not the way to AGI.” There’s a large camp of AI researchers who believe this to be true. But what to do about it? François Chollet, the creator of Keras, and Mike Knoop, co-founder of Zapier, are launching a challenge to find alternative ways to get to “general intelligence”.
What is going on here?
ARC Prize is offering $1M to improve the reasoning of AI models (not just LLMs).
What does this mean?
ARC stands for Abstraction and Reasoning Corpus. It’s a collection of tasks where you have to observe a few input-output pairs and predict the output for a similar input. We humans are pretty good at it.
Just try this example. You can easily guess the output for the Test Input.
SPOILER: Just get rid of the smaller piece. Play it here and guess the colour.
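To make the task format concrete, here is a minimal Python sketch of an ARC-style puzzle and the "get rid of the smaller piece" rule from the spoiler. It is only an illustration: the grids are made-up toy data (not the puzzle linked above), colours are encoded as small integers with 0 as the background (an assumption), and the names `components` and `keep_largest_piece` are hypothetical helpers, not part of any ARC tooling.

```python
from collections import deque

Grid = list[list[int]]  # assumption: 0 is background, other ints are colours


def components(grid: Grid) -> list[list[tuple[int, int]]]:
    """Return 4-connected groups of non-background cells as (row, col) lists."""
    rows, cols = len(grid), len(grid[0])
    seen: set[tuple[int, int]] = set()
    groups = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or (r, c) in seen:
                continue
            # Breadth-first flood fill starting from this unvisited cell.
            queue = deque([(r, c)])
            seen.add((r, c))
            group = []
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] != 0 and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            groups.append(group)
    return groups


def keep_largest_piece(grid: Grid) -> Grid:
    """Blank out every piece except the one with the most cells."""
    groups = components(grid)
    if not groups:
        return [row[:] for row in grid]
    keep = set(max(groups, key=len))
    return [
        [cell if (r, c) in keep else 0 for c, cell in enumerate(row)]
        for r, row in enumerate(grid)
    ]


# Hypothetical toy task: a few demonstration pairs plus one test input.
train_pairs = [
    ([[1, 1, 0, 0], [1, 1, 0, 2], [0, 0, 0, 0]],
     [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]),
]
test_input = [[0, 3, 0, 0], [0, 0, 0, 0], [4, 4, 4, 0]]

# The rule must reproduce every demonstration output before we trust it.
assert all(keep_largest_piece(i) == o for i, o in train_pairs)
print(keep_largest_piece(test_input))
```

The catch is that this rule was written by a human after looking at the examples. In the challenge, a system has to work out a different rule for each unseen task from just a few demonstration pairs.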
But what if AI systems try it? Well, the current state-of-the-art performance is 34% accuracy (humans are at 85%). So the organizers think ARC-AGI is a good test of general reasoning.
It defines AGI as a system that can efficiently acquire new skills outside of its training data. And to top it all, there’s a million-dollar prize pool for beating humans at this challenge.
François and Mike talk about the challenge more on Dwarkesh Patel’s podcast and the No Priors Pod.
Zvi responds by saying that the problem is solvable if you allow for memorization, which he claims is what humans kinda do anyway most of the time.
The market puts the chance that the prize is claimed at 51% by the end of 2025 and 23% by the end of this year.
Melanie Mitchell notes the Goodhart’s Law problem, where people try to solve this with brute force.
Why should I care?
The explanation by the organizers makes sense. LLMs don’t reason. Instead, they copy-paste reasoning patterns from the vast amount of data they consume during training.
The organizers also argue that AI progress has gone closed-source and that the “scale is all you need” narrative is hurting the direction of global research. New ideas are needed and I’m up for it (except that I spent way too much time playing the puzzles last night).