Vicuna is a new, powerful model based on LLaMA, fine-tuned on user-shared ChatGPT conversations and evaluated with GPT-4 as the judge. Vicuna boasts “90%* quality of OpenAI ChatGPT and Google Bard”. That is remarkable quality and performance, all on your computer and offline.
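For readers who want a sense of what running Vicuna “on your computer and offline” looks like outside the WebUI covered below, here is a minimal sketch using the Hugging Face transformers library. The checkpoint name, prompt template, and generation settings are illustrative assumptions, not taken from the video or the post; substitute whichever Vicuna weights you actually have downloaded.

```python
# Minimal sketch: load a locally downloaded Vicuna checkpoint and generate a reply.
# The model id below is an assumption for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "lmsys/vicuna-7b-v1.5"  # placeholder; point this at your local weights

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # roughly halves VRAM use versus float32
    device_map="auto",          # place layers on GPU if one is available
)

# Vicuna is a chat model, so wrap the user message in a conversation-style prompt.
prompt = (
    "A chat between a curious user and an artificial intelligence assistant.\n"
    "USER: Explain what running an LLM locally means.\nASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

In practice the one-click Oobabooga installer described in the timestamps below handles the model download and exposes the same functionality through a browser UI, so this snippet is only for those who prefer a script.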
We are honored that a new @MSFTResearch paper adopted our GPT-4 evaluation framework & showed Vicuna’s impressive performance against GPT-4!
The study brings great news for open chatbots: fine-tuning LLM on GPT-4 answers leads to top-notch results. Check their paper out! pic.twitter.com/yHkA9Fp9ic
— lmsys.org (@lmsysorg) April 7, 2023
Timestamps:
0:00 – Explanation
0:22 – What is Vicuna?
1:36 – What is Oobabooga?
2:18 – One-click install script
4:59 – Using the WebUI
5:45 – Basic chat & VRAM usage
6:24 – Guide responses in certain ways (POWERFUL)
7:14 – Preset AI characters/Personas
7:49 – Creating a character/AI persona
8:11 – Limitations
9:02 – Parameters
9:48 – Fine-Tuning
10:05 – Speech-To-Text and Text-To-Speech (tts/stt)
Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting-edge technologies, he is currently a Co-Founder of a startup and fundraiser for high-potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and a guest on numerous radio and podcast interviews. He is open to public speaking and advising engagements.
There can be security reasons for running an LLM locally, where documents to be summarized, for example, cannot be exposed to the possibility of being viewed. There are also performance issues whenever large amounts of data to be evaluated are sent across a TCP/IP connection.
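To make that scenario concrete, here is a rough sketch of local summarization using llama-cpp-python, so the document text never crosses a network connection. The model path and file name are placeholders assumed for illustration, not anything from the original post.

```python
# Summarize a sensitive file entirely on the local machine; nothing is sent
# over TCP/IP. Model path and input file are illustrative placeholders.
from llama_cpp import Llama

llm = Llama(model_path="models/vicuna-7b.Q4_K_M.gguf", n_ctx=4096)

with open("confidential_report.txt", "r", encoding="utf-8") as f:
    document = f.read()  # for very long documents you would chunk this first

prompt = (
    "USER: Summarize the following document in five bullet points.\n\n"
    f"{document}\n\nASSISTANT:"
)

result = llm(prompt, max_tokens=300, temperature=0.2)
print(result["choices"][0]["text"])  # summary produced entirely offline
```

Because both the weights and the document stay on disk, the only exposure is whatever the local machine itself is exposed to.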
The post missed mentioning the * attached to “90%*”. On their website, I read: “* According to a fun and non-scientific evaluation with GPT-4. Further rigorous evaluation is needed.”
Comparing LLMs is pretty difficult – seemingly. So what does 90% really mean? Good marketing?
My g,
They literally wrote and linked an entire research paper about it. The answers you seek are within your own power to obtain.
Free is free. If you can afford a computer that can run GPT-4-level models locally, you can afford a Pro subscription.
I am gonna guess the reason to run an LLM locally has much more to do with who runs the LLM than it does with the monthly cost of OpenAI’s tools.
That really depends on how much you intend to use it. I understand that the multi-agent game simulation took thousands of dollars’ worth of tokens. Auto-GPT adds up too.
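A quick back-of-the-envelope calculation shows how agent-style workloads add up; the per-token rates, tokens per step, and step count below are illustrative assumptions, not figures from the comment.

```python
# Illustrative API cost estimate for a long-running agent loop.
PROMPT_RATE = 0.03 / 1000       # assumed $ per prompt token
COMPLETION_RATE = 0.06 / 1000   # assumed $ per completion token

steps = 5000                     # assumed number of agent iterations
cost = steps * (2000 * PROMPT_RATE + 500 * COMPLETION_RATE)
print(f"Estimated spend for {steps} agent steps: ${cost:,.2f}")  # -> $450.00
```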
I’ve spent a few hundred on DALL-E images. I’ve also generated at least an order of magnitude more images with Stable Diffusion – I would simply not have made those images if I had to pay DALL-E rates every time.
Many people have computers with outsized specs for video editing and so on. I am one of them. I can afford a really good computer because I have a business model that directly profits from it. That doesn’t mean thousands in new costs would hurt me any less than they would any other middle-class person.