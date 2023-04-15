Martial arts legend and philosopher Bruce Lee famously said, “I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times.”

Many now fear ChatGPT and GPT-4. ChatGPT grew from the original task of guessing the next word in a sentence, and it has now done this task trillions of times. To adapt Lee: I fear not the AI that has practiced 10 trillion tasks once, but I fear the AI that has practiced next-word guessing 10 trillion times.
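To make the idea of next-word guessing concrete, here is a minimal sketch of a toy guesser that only counts which word most often follows another. It is not how ChatGPT works internally (ChatGPT uses a large neural network, not a lookup table), and the names `train_bigrams` and `guess_next` and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def guess_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus; a real model trains on hundreds of gigabytes.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)
print(guess_next(model, "the"))  # most common word seen after "the"
```

A table like this can only repeat pairs it has already seen. The point of the Bruce Lee comparison is that doing the same task at enormous scale forces a model to capture far more structure than simple counts.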

The success of ChatGPT and other large language models is revealing something about the nature of intelligence. It also suggests there is a grain of truth in the infinite monkey theorem, which states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, such as the complete works of William Shakespeare.

Generative AIs are learning reasoning and logic as emergent capabilities, driven by the need to do more in order to master the task of next-word guessing. ChatGPT is more than just infinite monkeys. It was trained on at least 570GB of data, and it learned from that large body of data. It will be interesting, to say the least, to see what emerges at 6,000 gigabytes of text plus larger amounts of sound, video and other data formats. For scale, Google Books estimated that 129,864,880 books have been published since the invention of Gutenberg’s printing press in 1440. At an average of 50,000 words per book, that is about 6.5 trillion words in total. ChatGPT may currently be outputting 310 million words per minute (wpm), or about 450 billion words per day.
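The arithmetic behind those figures can be checked in a few lines. This is only a back-of-the-envelope sketch: the inputs are the estimates quoted above (the 50,000 words per book and 310 million wpm figures are rough assumptions, not measurements), and the variable names are my own.

```python
# Figures quoted in the text; all are rough estimates.
BOOKS_PUBLISHED = 129_864_880      # Google Books estimate
WORDS_PER_BOOK = 50_000            # assumed average
WORDS_PER_MINUTE = 310_000_000     # assumed ChatGPT output rate
MINUTES_PER_DAY = 24 * 60

total_book_words = BOOKS_PUBLISHED * WORDS_PER_BOOK
words_per_day = WORDS_PER_MINUTE * MINUTES_PER_DAY

# How long would it take to output as many words as all books combined?
days_to_match_books = total_book_words / words_per_day

print(f"All books: {total_book_words / 1e12:.2f} trillion words")
print(f"Output per day: {words_per_day / 1e9:.1f} billion words")
print(f"Days to match all books: {days_to_match_books:.1f}")
```

If these estimates are in the right range, ChatGPT would output as many words as every book ever published in roughly two weeks.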

Stephen Wolfram, the creator of WolframAlpha, has described how ChatGPT works. WolframAlpha is an answer engine developed by Wolfram Research that answers factual queries by computing answers from externally sourced data. It can answer almost any question in math and science, and it was built upon an earlier product, Wolfram Mathematica, a technical computing platform.

Wolfram describes how ChatGPT discovered logic, reasoning and grammar from mastering the task of next-word guessing. Getting extremely good at next-word guessing required encoding and discovering logic, reasoning and math.

We need to understand how much of this comes from developing a synthetic ability to read and reason: creating structures for context and for connecting the dots, and ingesting descriptions that include all of the “answers” in Wikipedia and in every teacher guide. This form of AI currently has limitations in how much multilevel reasoning it can achieve.

The visible flaws of hallucination appear where its connecting of the dots goes off track. Once one flaw occurs, the errors compound.

This points to the need for many people and other systems to tune the AI and calibrate its answers.

Such a system is highly useful for massive summarization of existing works of knowledge created by people, and for using what it has ingested and structured to mimic other works. It can also act as a DJ and remix what it has.

It is tireless in ingesting large amounts of all kinds of information.

It is good at coding and programming because it has ingested vast amounts of the code written on GitHub. Coding is very structured and involves a lot of remixing of many mini-algorithms or blocks of old code and modifying them.

The world has GPS (the Global Positioning System), which uses satellites to provide precise locations. The US military has tried to make portable versions of GPS. These systems start precisely from known fixed locations, but errors increase as they are moved. If there is no way to re-acquire a precise known location, the errors grow too large.
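The drift described here can be sketched with a toy model. This is an illustration under simple assumptions, not a model of any real military system: a device walks due east one unit per step, its heading estimate picks up a small constant bias each step, and the hypothetical `fix_every` parameter stands in for re-acquiring a known location.

```python
import math


def max_position_error(steps, heading_bias_deg, fix_every=None):
    """Walk east one unit per step while the heading estimate drifts.

    Returns the largest gap between estimated and true position.
    fix_every: if set, reset to a known position and heading that often.
    """
    est_x = est_y = 0.0
    heading_error = 0.0
    worst = 0.0
    for step in range(1, steps + 1):
        # The heading bias accumulates with every step taken.
        heading_error += math.radians(heading_bias_deg)
        est_x += math.cos(heading_error)
        est_y += math.sin(heading_error)
        true_x, true_y = float(step), 0.0
        worst = max(worst, math.hypot(est_x - true_x, est_y - true_y))
        # Re-acquiring a known fix wipes out the accumulated error.
        if fix_every and step % fix_every == 0:
            est_x, est_y, heading_error = true_x, true_y, 0.0
    return worst


print(max_position_error(100, 0.5))                # no fixes: error keeps growing
print(max_position_error(100, 0.5, fix_every=10))  # periodic fixes keep it small
```

Without fixes the worst-case error keeps growing with distance traveled. With a known fix every ten steps it stays bounded, which is the behavior the analogy relies on.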

I believe these analogies are a way to think about both the power and limitations of generative AI. If my understanding is correct, it offers a simplified approximation of the problems and of what is being done to advance capabilities.

ChatGPT does better when it is pulling answers from its training data and needs to make only one or two connections within that data. The more layers of connections and reasoning are required, the more errors accumulate.
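A rough way to see why errors accumulate with more layers of reasoning: if each step in a chain is right with some probability, and the steps are assumed to be independent, the chance that the whole chain is right is that probability raised to the number of steps. The 95% per-step figure below is a hypothetical number chosen for illustration, not a measured property of ChatGPT, and `chain_accuracy` is my own name.

```python
def chain_accuracy(per_step_accuracy, steps):
    """Probability that every step in a chain is correct,
    assuming each step succeeds independently."""
    return per_step_accuracy ** steps


# Hypothetical 95% accuracy per reasoning step.
for steps in (1, 2, 5, 10, 20):
    print(steps, round(chain_accuracy(0.95, steps), 3))
```

Under these assumptions, one or two connections remain above 90% reliable, while a chain of 10 steps is right only about 60% of the time and a chain of 20 steps only about 36% of the time.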

This highlights the importance of better and larger training datasets.