OpenAI O3 crushes benchmark tests and puzzles, but is it intelligence?
There are substantial compute costs, roughly $5,000 to $500,000, to solve the hard problems.
AI compute will become vastly more efficient and lower cost.
The benchmark problems have single correct answers, but many real-world problems involve compromises and shades of grey.
Anything with a clear loss function is solvable with LLMs and the OpenAI systems.
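A minimal sketch of what a “clear loss function” means, assuming ARC-style grid puzzles scored by exact match (the function names are illustrative, not from OpenAI’s or ARC’s actual evaluation code):

```python
# Sketch of the "clear loss function" distinction, using ARC-style
# grid puzzles as the example. Illustrative only.

def arc_loss(predicted_grid, target_grid):
    """Clear loss: exact match. Returns 0.0 if the grids are identical,
    1.0 otherwise. An optimizer gets an unambiguous signal from this."""
    return 0.0 if predicted_grid == target_grid else 1.0

def essay_quality_loss(essay: str) -> float:
    """A 'shades of grey' objective: there is no single correct answer,
    so any numeric score bakes in contestable judgment calls."""
    raise NotImplementedError("No agreed-upon ground truth exists.")

# Exact-match scoring is what makes these benchmark problems tractable:
print(arc_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]]))  # 0.0 -> solved
print(arc_loss([[1, 0], [0, 1]], [[1, 1], [0, 1]]))  # 1.0 -> not solved
```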

When a binary computer asks “Who am I?”, I’ll consider it “self-aware”, maybe. I’d ask many questions. The more questions IT would ask, the more likely I would say “perhaps”. One resilient hallmark of biological life is its ability to adapt. A child does not need to be “taught” this; they just know… Years ago, I was running a program that had several million lines of code. Then one day, it stopped. Period. The “autopsy” revealed seven 1s and 0s in one line were “wrong”. Seriously?! Yup. My point? Living systems are self-correcting/repairing. Or they don’t stay alive.
I don’t know how to impart that innate instinct to survive into a non-biological machine. I also must ask: do I want to?
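The flipped-bits anecdote has a standard engineering counterpart: redundancy with error correction. A minimal, illustrative Python sketch (not a claim about how any real system is built) of bitwise majority voting over three stored copies, repairing a corrupted copy:

```python
# Triple modular redundancy with majority voting: a crude machine
# analog of "self-repair" against the kind of bit flips described
# above. Illustrative sketch only.

def majority_vote(copy_a: bytes, copy_b: bytes, copy_c: bytes) -> bytes:
    """For each bit position, keep the value that at least two of the
    three stored copies agree on."""
    repaired = bytearray()
    for a, b, c in zip(copy_a, copy_b, copy_c):
        # A bit survives if it is set in at least two of the copies.
        repaired.append((a & b) | (a & c) | (b & c))
    return bytes(repaired)

code = b"critical line of the program"
corrupted = bytearray(code)
corrupted[3] ^= 0b0000_0111  # flip a few bits in one copy

# The two clean copies out-vote the corrupted one, restoring the original.
assert majority_vote(bytes(corrupted), code, code) == code
```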
^Anything with a clear loss function is solvable with LLMs and the OpenAI systems.^
Yes, with brute-force algos, so OpenAI has no talent left.
Of course there is more than one definition of AGI. The YouTube guy basically says AGI should be able to find an optimal path that addresses multiple objectives, while the test that was given consisted of cognitive problems that each had just one correct answer, so it was not a real test of AGI (a sketch after this comment illustrates the difference).
However, according to Wikipedia, “Artificial general intelligence (AGI) is a type of artificial intelligence (AI) that matches or surpasses human cognitive capabilities across a wide range of cognitive tasks.”
E.g.:
reason, use strategy, solve puzzles, and make judgments under uncertainty
represent knowledge, including common sense knowledge
plan
learn
communicate in natural language
if necessary, integrate these skills in completion of any given goal
Given we were only shown two examples, it’s hard to judge whether the ARC test O3 did so well on incorporated enough of those challenges to meet this AGI requirement. But at least according to the Wikipedia definition, it’s making very good progress.
Is the YouTube guy’s definition a better definition of AGI? It may be a better description of day-to-day human thinking, but I’m not convinced that’s what is most important here.
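The single-answer vs. multi-objective distinction is concrete enough to sketch. Below is a minimal, illustrative Python example (the routes, numbers, and function names are invented for the example) contrasting a test with one correct answer against a planning problem where no option wins on every objective, so the “optimal path” depends on how the objectives are weighed:

```python
# One correct answer: grading is unambiguous, like the ARC puzzles.
def grade(answer, correct_answer) -> bool:
    return answer == correct_answer

# Multiple objectives: each route scores differently on cost, time,
# and risk (all lower = better), and no route dominates the others.
routes = {
    "highway":  {"cost": 10, "time": 1.0, "risk": 0.30},
    "backroad": {"cost":  4, "time": 2.5, "risk": 0.10},
    "transit":  {"cost":  2, "time": 3.0, "risk": 0.05},
}

def best_route(weights: dict) -> str:
    """The 'optimal path' shifts with the chosen trade-offs."""
    def score(route):
        return sum(weights[k] * routes[route][k] for k in weights)
    return min(routes, key=score)

print(best_route({"cost": 1.0, "time": 0.1, "risk": 1.0}))  # transit: cheap and safe
print(best_route({"cost": 0.1, "time": 5.0, "risk": 1.0}))  # highway: time dominates
```

There is no single right answer to the second kind of problem, only defensible trade-offs, which is exactly what a one-correct-answer benchmark cannot measure.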