Vicuna is a new, powerful model based on LLaMA, fine-tuned on user-shared ChatGPT conversations and evaluated with GPT-4 as the judge. Vicuna boasts “90%* quality of OpenAI ChatGPT and Google Bard”. That is remarkable quality and performance, all on your computer and offline.
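For readers who want a sense of what running Vicuna “on your computer and offline” looks like outside the WebUI covered below, here is a minimal sketch using the Hugging Face transformers library. The checkpoint name, prompt template, and generation settings are illustrative assumptions, not taken from the video or the post; substitute whichever Vicuna weights you actually have downloaded.

```python
# Minimal sketch: load a locally downloaded Vicuna checkpoint and generate a reply.
# The model id below is an assumption for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "lmsys/vicuna-7b-v1.5"  # placeholder; point this at your local weights

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # roughly halves VRAM use versus float32
    device_map="auto",          # place layers on GPU if one is available
)

# Vicuna is a chat model, so wrap the user message in a conversation-style prompt.
prompt = (
    "A chat between a curious user and an artificial intelligence assistant.\n"
    "USER: Explain what running an LLM locally means.\nASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

In practice the one-click Oobabooga installer described in the timestamps below handles the model download and exposes the same functionality through a browser UI, so this snippet is only for those who prefer a script.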
We are honored that a new @MSFTResearch paper adopted our GPT-4 evaluation framework & showed Vicuna’s impressive performance against GPT-4!
The study brings great news for open chatbots: fine-tuning LLM on GPT-4 answers leads to top-notch results. Check their paper out! pic.twitter.com/yHkA9Fp9ic
— lmsys.org (@lmsysorg) April 7, 2023
Timestamps:
0:00 – Explanation
0:22 – What is Vicuna?
1:36 – What is Oobabooga?
2:18 – One-click install script
4:59 – Using the WebUI
5:45 – Basic chat & VRAM usage
6:24 – Guide responses in certain ways (POWERFUL)
7:14 – Preset AI characters/Personas
7:49 – Creating a character/AI persona
8:11 – Limitations
9:02 – Parameters
9:48 – Fine-Tuning
10:05 – Speech-To-Text and Text-To-Speech (tts/stt)
Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting-edge technologies, he is currently a Co-Founder of a startup and fundraiser for high-potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and a guest on numerous radio and podcast interviews. He is open to public speaking and advising engagements.
There can be security reasons for running an LLM locally, where documents to be summarized, for example, cannot be exposed to the possibility of being viewed. There are also performance issues whenever large amounts of data to be evaluated are sent across a TCP/IP connection.
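To make that scenario concrete, here is a rough sketch of local summarization using llama-cpp-python, so the document text never crosses a network connection. The model path and file name are placeholders assumed for illustration, not anything from the original post.

```python
# Summarize a sensitive file entirely on the local machine; nothing is sent
# over TCP/IP. Model path and input file are illustrative placeholders.
from llama_cpp import Llama

llm = Llama(model_path="models/vicuna-7b.Q4_K_M.gguf", n_ctx=4096)

with open("confidential_report.txt", "r", encoding="utf-8") as f:
    document = f.read()  # for very long documents you would chunk this first

prompt = (
    "USER: Summarize the following document in five bullet points.\n\n"
    f"{document}\n\nASSISTANT:"
)

result = llm(prompt, max_tokens=300, temperature=0.2)
print(result["choices"][0]["text"])  # summary produced entirely offline
```

Because both the weights and the document stay on disk, the only exposure is whatever the local machine itself is exposed to.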
The post missed mentioning the * attached to “90%*”. On their website, I read: “* According to a fun and non-scientific evaluation with GPT-4. Further rigorous evaluation is needed.”
Comparing LLMs is pretty difficult – seemingly. So what does 90% really mean? Good marketing?
My g,
They literally wrote and linked an entire research paper about it. The answers you seek are within your own power to obtain.
Free is free. If you can afford a computer that can run GPT-4-level models locally, you can afford a Pro subscription.
I am gonna guess the reason to run an LLM locally has much more to do with who runs the LLM than it does with the monthly cost of OpenAI’s tools.
That really depends on how much you intend to use it. I understand that the multi-agent game simulation took thousands of dollars’ worth of tokens. Auto-GPT adds up too.
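A quick back-of-the-envelope calculation shows how agent-style workloads add up; the per-token rates, tokens per step, and step count below are illustrative assumptions, not figures from the comment.

```python
# Illustrative API cost estimate for a long-running agent loop.
PROMPT_RATE = 0.03 / 1000       # assumed $ per prompt token
COMPLETION_RATE = 0.06 / 1000   # assumed $ per completion token

steps = 5000                     # assumed number of agent iterations
cost = steps * (2000 * PROMPT_RATE + 500 * COMPLETION_RATE)
print(f"Estimated spend for {steps} agent steps: ${cost:,.2f}")  # -> $450.00
```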
I’ve spent a few hundred on DALL-E images. I’ve also generated at least an order of magnitude more images with Stable Diffusion – I would simply not have made those images if I had to pay DALL-E rates every time.
Many people have computers with outsized specs for video editing and so on. I am one of them. I can afford a really good computer because I have a business model that directly profits from it. That doesn’t mean thousands in new costs would hurt me any less than they would any other middle-class person.