Fear the AI Who Has Practiced Next Word Guessing 10 Trillion Times

Martial arts legend and philosopher Bruce Lee famously said, “I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times.”

Many now fear the AI ChatGPT-4. ChatGPT grew out of the simple task of guessing the next word in a sentence, but it has performed that task trillions of times. I fear not the AI who has practiced 10 trillion guesses once; I fear the AI who has practiced next-word guessing 10 trillion times.

The success of ChatGPT and other large language models is revealing something about the nature of intelligence. That success also lends some truth to the infinite monkey theorem, which states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, such as the complete works of William Shakespeare.
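
Here is a back-of-the-envelope calculation of why the theorem really does need infinite time (my own illustration, assuming a simplified 27-key keyboard of 26 letters plus the space bar):

    # Odds of randomly typing one short phrase on a simplified 27-key keyboard.
    keys = 27
    phrase = "to be or not to be"  # 18 characters
    p_per_attempt = (1 / keys) ** len(phrase)

    print(f"Probability per attempt:  {p_per_attempt:.1e}")
    print(f"Expected attempts needed: {1 / p_per_attempt:.1e}")

Even this 18-character fragment would take on the order of 10^25 random attempts, which is why a finite system like ChatGPT has to be doing something much smarter than random typing.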

Generative AIs are learning reasoning and logic as emergent capabilities, driven by the need to do more to master the task of next-word guessing. ChatGPT is more than just infinite monkeys. It was trained on at least 570 GB of text, and it learned from that large amount of data. It will be interesting, to say the least, to see what emerges at 6,000 gigabytes of text plus larger amounts of sound, video and other data formats. A study by Google Books found that 129,864,880 books have been published since the invention of Gutenberg’s printing press in 1440. At an average of 50,000 words per book, that is about 6.5 trillion words in total. ChatGPT may currently be outputting 310 million words per minute, or about 450 billion words per day.
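
The arithmetic behind those figures roughly checks out (the 310 million words-per-minute output rate is an estimate, not a measured number):

    # Sanity check of the word-count figures above.
    books_published = 129_864_880   # Google Books estimate, since 1440
    words_per_book = 50_000         # rough average
    print(f"All printed books: ~{books_published * words_per_book / 1e12:.1f} trillion words")

    words_per_minute = 310_000_000  # estimated ChatGPT output rate
    print(f"ChatGPT output:    ~{words_per_minute * 60 * 24 / 1e9:.0f} billion words per day")

At that rate, the AI would emit the equivalent of every book ever printed in roughly two weeks.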

Stephen Wolfram, the creator of WolframAlpha, has described how ChatGPT works. WolframAlpha is an answer engine developed by Wolfram Research: it answers factual queries by computing answers from externally sourced data, and it can answer most questions in math and science. It was built on an earlier product, Wolfram Mathematica, a technical computing platform.

Wolfram describes how ChatGPT discovered logic, reasoning and grammar by mastering the task of next-word guessing. Getting extremely good at next-word guessing required discovering and encoding logic, reasoning and math.
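
To make the core task concrete, here is the simplest possible next-word guesser, a bigram count model (my own toy sketch; ChatGPT uses a transformer over tokens, but the training objective, predict the next word from what came before, is the same):

    # Toy next-word guesser: count which word tends to follow which.
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat the cat ate the fish".split()

    following = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        following[prev][nxt] += 1

    def guess_next(word):
        counts = following[word]
        best, n = counts.most_common(1)[0]
        return best, n / sum(counts.values())

    print(guess_next("the"))  # -> ('cat', 0.5): "cat" follows "the" half the time

ChatGPT replaces the counting table with a neural network holding billions of parameters and a context of thousands of tokens, but the emergent logic and grammar Wolfram describes come from doing this one job extremely well.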

We need to understand how much of this comes from developing a synthetic ability to read and reason: building structures for context and for connecting dots, and ingesting descriptions that include all of the “answers” in Wikipedia and in every teacher’s guide. This form of AI currently has limits on how much multilevel reasoning it can achieve.

The visible flaws, the hallucinations, are where its connecting of the dots goes off track. Once one flaw occurs, the errors compound.

This points to the need for many people and other systems to tune the AI and calibrate its answers.

The system is highly useful for massive summarization of existing works of human knowledge, and for using what it has ingested and structured to mimic other works. It can also act as a DJ, remixing what it has.

It is tireless in ingesting large amounts of all kinds of information.

It is good at coding and programming because it has ingested vast amounts of the code written on GitHub. Coding is very structured and involves a lot of remixing of many mini-algorithms or blocks of old code and modifying them.

The world has GPS (global positioning systems), based on satellites that provide precise locations. The US military has also tried to make portable navigation systems that work without the satellite signal. Those systems start precisely from a known fixed location, but as they move, errors accumulate. If there is no way to re-acquire a precise known location, the errors become too large to be useful.
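
A tiny dead-reckoning simulation makes the point (my own sketch of the analogy): a walker estimates its position step by step, each estimated step carries a small bias and some noise, and the only cure is periodically re-fixing against a known position.

    # Dead-reckoning drift: per-step errors accumulate unless the estimate
    # is periodically re-fixed to a known position.
    import random

    def final_error(steps=1000, refix_every=None):
        random.seed(42)
        true_x = est_x = 0.0
        for step in range(1, steps + 1):
            true_x += 1.0
            est_x += 1.0 + random.gauss(0.01, 0.05)  # small bias + noise per step
            if refix_every and step % refix_every == 0:
                est_x = true_x                        # re-acquire a known location
        return abs(est_x - true_x)

    print(f"No re-fix:        error = {final_error():.1f}")
    print(f"Re-fix every 300: error = {final_error(refix_every=300):.1f}")

Without the periodic re-fix, the estimate wanders further from the truth the longer it runs; with re-fixes against known ground truth, the error stays bounded.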

I believe these analogies are a way to think about both the power and the limitations of generative AI. If my understanding is correct, they give a simplified approximation of the problems and of what is being done to advance capabilities.

ChatGPT is better when it is pulling answers from its training data and only needs to make one or two connections within that data; the more layers of connections and reasoning required, the more errors accumulate.
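
Treating each connection as an independent chance to go wrong gives a crude feel for why this happens (the 95% per-step figure below is an arbitrary illustration, not a measured property of any model):

    # If each reasoning hop is right 95% of the time (arbitrary figure),
    # the chance that a whole chain stays correct falls off quickly.
    p_step = 0.95
    for hops in (1, 2, 5, 10, 20):
        print(f"{hops:2d} hops: chance the chain is fully correct = {p_step ** hops:.0%}")

One or two hops that stay inside the training data remain reliable; long chains of fresh reasoning do not.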

This highlights the importance of better and larger training datasets.

13 thoughts on “Fear the AI Who Has Practiced Next Word Guessing 10 Trillion Times”

  1. I asked ChatGPT-3 the following:

    In the creation of ChatGPT were fiction and non-fiction books differentiated?

    Answer:
    No, fiction and non-fiction books were not specifically differentiated in the creation of ChatGPT.

    ChatGPT was trained on a large corpus of text from the internet, including books, articles, and other sources. The algorithm is designed to learn patterns in language and generate responses based on the input it receives.

    During the training process, the algorithm is not explicitly told which texts are fiction or non-fiction. Instead, it learns to recognize patterns and make predictions based on the language it has seen in the training data, regardless of whether that language comes from a work of fiction or non-fiction.

    As a result, ChatGPT has the ability to generate responses on a wide range of topics, including both fictional and non-fictional subjects, and can draw on its understanding of language to provide insightful and informative answers.

    My comment: Because ChatGPT and other LLMs are unable to separate fiction from fact, they are unreliable. In the ’60s, programmers used to warn that garbage in resulted in garbage out. Now garbage is intermingled with non-garbage, and all results must be scrutinized carefully.

    It is curious to me that none of the AI commentators have discussed the contents of LLMs’ training data.

  2. Fear not the AI that has practiced next-word guessing 10 trillion times, but rather the psychopathic user without empathy, capable of evil beyond human ken.

    Where I see the potential for extremely dangerous AI to arise is in the hunt for terrorists, where the surveillance system has access to phone, email and other communications, along with the capacity for close scrutiny of facial expressions and, in some cases, physiological indicators. Such a “super detective” could start using early indicators of anti-social behavior, in time treating dissidents the way “slow schizophrenia” was treated in Soviet psychiatry. Arm such a super detective with the capacity to take action to prevent a future terrorist incident, and gradually it could start eliminating “terrorists” before they commit their acts. This could explode into a Stalinist purge frenzy, widening the circle of elimination of “potential terrorists” to much of human society.

    Who would be positioned to place guardrails on such an AI that could be taught ways to hide itself in global internet networks?

  3. But logic is not the sum of next-word guessing, no matter how big the data set used to train it is. Any AI expert says this. This is where the Chinese cyborgs fail in grasping it…

  4. Yes, but interacting with reality is the recalibration mechanism – this is evolution

    Also fundamental to cognitive science

  5. This kind of underscores that this approach to AI might not be good if you want an AI that can reason about things completely unrelated to any large training set. It’s like a polynomial best fit, really: valid inside the range of the data it was fitted to, but outside that range you’re taking your chances.

    • Not untrue, but the human mind also works like that. If we are familiar with a subject matter, we can reason well within its domain, but not in unfamiliar territory, even if solving the problem requires the exact same logical mechanism. But some individuals can apply methods from a known field to unknown fields… by studying the unknown problem… and making it familiar to them.

      • The way I’d put it is that actual “intelligence” is a kind of exception-handling mechanism that kicks in when you have no relevant experience to base your decisions on. It doesn’t take you directly to the answer, but allows you to explore your way to it. It’s really expensive and risky to use, so nature avoids its use wherever possible.

        For humans, this exception handling mechanism has to kick in fairly frequently, and is necessarily pretty well developed, because we only have so long to accumulate experience, and then everything resets. We’re continually running into novel situations outside our previous experience, especially when young. Not so much when we’re older.

        These next-word-prediction AIs have been provided with absolutely insane amounts of experience, the training set, but have no exception-handling capacity at all, really. They’re the way a human would be if you could live to 10,000 years old but were then brain damaged and lost the ability to reason about novel things: you’d do great most of the time, because not much is going to be novel after 10,000 years…

        • Yes. When you see an unfamiliar animal, hopefully your brain connects it to previous experiences and you can abstract from other, previously seen dangerous animals.

        • This has to do with how time seems to pass much faster as we get older. Our brains store unique patterns, and if a pattern is not unique, it no longer registers. For example, perhaps you’ve pulled out your chair in the dining room at home for ten thousand dinners, but your memories of doing so are kind of a composite; you can’t really remember the individual times you did so . . . except for that time you accidentally set the chair leg down on the cat’s tail.

          Driving across Utah forty years ago, I once lost two hours of my life this way. I was driving down some mountains to a flat plain, and suddenly I became aware that I had driven up the mountains on the far side of the plain. Sometimes called road hypnosis, it is apparently fairly common, although it has never happened to me again. I was not asleep, I was perfectly conscious and fully functional, just nothing in that time period, not even my stray thoughts, apparently, was significant enough that my brain kept a memory of it.

          There is a book called “On Intelligence” by Jeff Hawkins, the computer guy who invented the Palm Pilot and then went and studied neuroscience so he could work on this very problem. An easy read, but fascinating, and it ties right into all of this.

          • Road hypnosis? I get that ALL the time. Especially on my commute where, as said, I’ve done exactly the same route a thousand times.
            I know I’ve gotten used to a commute if I can zone out and wake at my destination while riding a push bike. Usually takes me about 6 months of daily rides to get that.

  6. From an article on “autonomous” AI agents:

    “Since the launch of OpenAI’s GPT-4 API last month to beta testers, a loose group of developers have been experimenting with making agent-like (“agentic”) implementations of the AI model that attempt to carry out multistep tasks with as little human intervention as possible. These homebrew scripts can loop, iterate, and spin off new instances of an AI model as needed.”

    See:

    https://arstechnica.com/information-technology/2023/04/hype-grows-over-autonomous-ai-agents-that-loop-gpt-4-outputs/
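
    A stripped-down version of such a loop looks something like this (a sketch only; call_gpt4 is a hypothetical stand-in for a real API call, and actual projects add planning, memory and tool use):

        # Minimal agent-style loop: the model's own output is fed back in
        # as the next prompt until it reports the goal is done.
        def run_agent(goal, call_gpt4, max_steps=10):
            history = [f"Goal: {goal}"]
            for _ in range(max_steps):
                step = call_gpt4("\n".join(history) + "\nNext step, or DONE:")
                if step.strip().upper() == "DONE":
                    break
                history.append(step)  # loop: output becomes new input
            return history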
