How Do the Best LLMs Compare and Rank Today

HuggingFace has rankings of the best large language models based on the votes of over 70,000 users. OpenAI's GPT-4 is still number one and is followed by three versions of Anthropic's Claude. Fifth is GPT-3.5 Turbo, followed by the first open-source model on the list, Vicuna-33B. Seventh is Meta’s Llama 2 70B Chat. The HuggingFace …

Elon Musk Made X.AI for Safe Artificial General Intelligence

The 94th chapter of Walter Isaacson's biography of Elon Musk is about AI for humans. It describes the origin, purpose and some details of X.AI. Elon Musk was musing that he was certain he could make Twitter (now X) into the largest financial institution in the world. However, he wanted to spend his time on more meaningful things. One of …

ChatGPT for the Enterprise Amplifies Office Productivity

OpenAI released ChatGPT for the enterprise, which aspires to be a general AI assistant for all office workers. OpenAI also announced that it is passing a revenue run rate of $1 billion per year (now $80 million per month). This is about triple the level of revenue 12 months ago. 80% statistic of Fortune 500 companies …

HuggingFace Launches Open HuggingChat and OpenAI Will Offer ChatGPT Business

HuggingChat is a new open-source 30B chatbot alternative to ChatGPT. There will likely soon be HuggingChat apps. Jim Fan believes HuggingFace is in a great position to become the Android App Store. HuggingFace may have an edge over OpenAI: the apps can be other multimodal models already on HuggingFace. HuggingChat, the open-source 30B chatbot …

Evaluating Large Language Models

ChatGPT and GPT-4 are large language models (LLMs). There are four major aspects of LLMs: pre-training, adaptation tuning, utilization, and capacity evaluation. Here is one of the new summaries of the available resources for developing LLMs and of the issues for future directions. Chain-of-Thought (CoT) is an improved prompting strategy to boost the performance of LLMs on complex …

Fear the AI Who Has Practiced Next Word Guessing 10 Trillion Times

Martial arts legend and philosopher Bruce Lee famously said, “I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times.” Many now fear the AI, ChatGPT-4. ChatGPT grew from the original task of guessing the next word in a sentence, but it has …

Emergence and Reasoning in Large Language Models

Emergent capabilities are abilities that are not present in smaller models but are present in larger models. This is discussed in the video below by Jason Wei, a Google AI researcher. I had a prior article that discussed the summary from Alan Thompson on what capabilities emerged at what point for the large language models. …

Generative AI Agents Simulate Real Human Behavior

Stanford researchers have used generative AI to simulate believable human behavior in a simulated world. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative …

ChatGPT AI Projected to Impact Over 300 Million Jobs and World GDP

A Goldman Sachs analysis finds that ChatGPT and other generative AI could spark a productivity boom, automating 25% of jobs in North America and Europe and raising annual global GDP by 7% over a 10-year period. The new AI could lift productivity growth by 1.5 percentage points over a 10-year period. This is …

Vicuna is the Current Best Open Source AI Model for Local Computer Installation

Vicuna is a new, powerful model based on LLaMA, and trained with GPT-4. Vicuna boasts “90%* quality of OpenAI ChatGPT and Google Bard”. This is previously unseen quality and performance, all on your computer and offline. Oobabooga is a UI for running large language models, including Vicuna and many other models like LLaMA, llama.cpp, GPT-J, Pythia, …

GPT-4 With Reflexion Has a Superior Coding Score

A slightly improved Reflexion-based GPT-4 agent achieves state-of-the-art pass@1 results (88%) on HumanEval, outperforming GPT-4 (67.0%) and CodeT: Code Generation with Generated Tests (65.8%), which were the previous state-of-the-art standards. Relaxing success evaluation: by using Reflexion to iteratively refine the current implementation, researchers are shifting the “accuracy bottleneck” from correct syntactic and semantic code generation …
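For readers unfamiliar with the pass@1 figure quoted above, here is a minimal sketch of how a pass@k score is commonly computed. It assumes the unbiased estimator introduced alongside the HumanEval benchmark (1 - C(n-c, k) / C(n, k) per problem, averaged over problems); the excerpt does not say exactly how the Reflexion results were scored, and the function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical, chosen only for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of code samples generated for the problem
    c: number of those samples that passed all unit tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # If fewer than k samples failed, any k-subset contains a passing sample.
    if n - c < k:
        return 1.0
    # Probability that a random k-subset contains at least one passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over a benchmark, given (n, c) pairs per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With a single sample per problem (n = 1, k = 1), the score reduces to the fraction of problems whose one generated solution passes its tests, which is the usual reading of a pass@1 percentage.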
