OpenAI O3 Crushes Benchmark Tests, But Is It Intelligence?

OpenAI O3 crushes benchmark tests and puzzles, but is it intelligence?

Solving the hard problems carries substantial compute costs, from $5,000 to $500,000.

AI compute will become vastly more efficient and cheaper.

The benchmark problems have correct answers, but problems that involve compromises and shades of grey may raise more issues.

Anything with a clear loss function is solvable with LLMs and the OpenAI systems.

3 thoughts on “OpenAI O3 Crushes Benchmark Tests, But Is It Intelligence?”

  1. When a binary computer asks “Who am I?”, I’ll consider it “self-aware”, maybe. I’d ask many questions. The more questions IT would ask, the more likely I would say “perhaps”. One resilient hallmark of biological life is its ability to adapt. A child does not need to be “taught this”; they just know… Years ago, I was running a program that had several million lines of code. Then one day, it stopped. Period. The “autopsy” revealed that 7 1s and 0s in one line were “wrong”. Seriously?! Yup. My point? Living systems are self-correcting/repairing. Or they don’t stay alive.

    I don’t know how to impart that innate instinct to survive into a non-biological machine. I also must ask, do I want to?

  2. ^Anything with clear loss functions are solvable with LLM and the OpenAI systems.^
    Yes, with brute-force algos, so OpenAI has no talent left

  3. Of course, there is more than one definition of AGI. The YouTube guy basically says AGI should be able to find an optimal path that addresses multiple objectives, while the test that was given consisted of cognitive problems that had just one correct answer – so not a real test of AGI.

    However, according to Wiki, “Artificial general intelligence (AGI) is a type of artificial intelligence (AI) that matches or surpasses human cognitive capabilities across a wide range of cognitive tasks.”
    E.g.
    reason, use strategy, solve puzzles, and make judgments under uncertainty
    represent knowledge, including common sense knowledge
    plan
    learn
    communicate in natural language
    if necessary, integrate these skills in completion of any given goal

    Given we were only shown two examples, it’s hard to judge whether the ARC test that O3 did so well on incorporated enough of those challenges to meet this AGI requirement. But, at least according to Wiki, it’s making very good progress.

    Is the YouTube guy’s definition a better definition of AGI? It may be a better description of day-to-day human thinking, but I’m not convinced that’s what is most important here.
