Google, OpenAI, DeepSeek, et al. are nowhere near achieving AGI (Artificial General Intelligence), according to a new benchmark.
The Arc Prize Foundation, a nonprofit that measures AGI progress, has a new benchmark that is stumping the leading AI models. The test, called ARC-AGI-2, is the second edition of the ARC-AGI benchmark, which tests models on general intelligence by challenging them to solve visual puzzles using pattern recognition, context clues, and reasoning.
According to the ARC-AGI leaderboard, OpenAI's most advanced model, o3-low, scored 4 percent. Google's Gemini 2.0 Flash and DeepSeek R1 both scored 1.3 percent. Anthropic's most advanced model, Claude 3.7 with an 8K token limit (which refers to the number of tokens used to process an answer), scored 0.9 percent.
The question of how and when AGI will be achieved remains as heated as ever, with various factions bickering about the timeline or whether it's even possible. Anthropic CEO Dario Amodei said it could take as little as two to three years, and OpenAI CEO Sam Altman said "it's achievable with current hardware." But experts like Gary Marcus and Yann LeCun say the technology isn't there yet, and it doesn't take an expert to see how fueling AGI hype is advantageous to AI companies seeking major investments.
The ARC-AGI benchmark is designed to challenge AI models beyond specialized intelligence by avoiding the memorization trap: spewing out PhD-level responses without an understanding of what they mean. Instead, it focuses on puzzles that are relatively easy for humans to solve because of our innate ability to take in new information and make inferences, thus revealing gaps that can't be resolved by simply feeding AI models more data.
"Intelligence requires the ability to generalize from limited experience and apply knowledge in new, unexpected situations. AI systems are already superhuman in many specific domains (e.g., playing Go and image recognition)," read the announcement.
"However, these are narrow, specialized capabilities. The 'human-ai gap' reveals what's missing for general intelligence - highly efficiently acquiring new skills."
To get a sense of AI models' current limitations, you can take the ARC-AGI test for yourself. And you might be surprised by its simplicity. There's some critical thinking involved, but the ARC-AGI test wouldn't be out of place next to the New York Times crossword puzzle, Wordle, or any of the other popular brain teasers. It's challenging but not impossible, and the answer is there in the puzzle's logic, which is something the human brain has evolved to interpret.
OpenAI's o3-low model scored 75.7 percent on the first edition of ARC-AGI. By comparison, its 4 percent score on the second edition shows how difficult the new test is, and how much work remains before AI reaches human-level intelligence.