Big Language Models Can Train Small and Cheap Language Models

Large language models can train smaller language models and uplevel them quickly. Stanford researchers built Alpaca 7B, a model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations generated with OpenAI's GPT-3.5 (text-davinci-003). In their preliminary evaluation of single-turn instruction following, Alpaca behaves qualitatively similarly to text-davinci-003, while being surprisingly small and cheap to reproduce (under $600).
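
Below is a minimal sketch of that distillation recipe in Python. It is not the Stanford code: `query_teacher` is a hypothetical placeholder for whatever API call returns completions from text-davinci-003 (or another teacher), and the prompt template and output file name are illustrative assumptions. The resulting (instruction, output) pairs are then used for ordinary supervised fine-tuning of the small base model.

```python
# A rough sketch of the Alpaca-style distillation step (illustrative only):
# collect instruction-following demonstrations from a strong teacher model,
# then fine-tune a small base model (e.g., LLaMA 7B) on the resulting pairs.
import json

# Hypothetical prompt template; the real Alpaca prompts differ in detail.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_demonstrations(instructions, query_teacher, out_path="demos.json"):
    """Ask the teacher model to answer each instruction and save the pairs.

    `query_teacher` is a placeholder: any callable that takes a prompt string
    and returns the teacher model's completion as a string.
    """
    records = []
    for instruction in instructions:
        completion = query_teacher(PROMPT_TEMPLATE.format(instruction=instruction))
        records.append({"instruction": instruction, "output": completion.strip()})
    with open(out_path, "w") as f:
        json.dump(records, f, indent=2)
    return records

# The saved (instruction, output) pairs then serve as a standard supervised
# fine-tuning dataset for the small model, trained with a causal LM loss.
```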

Here is a video describing the work.

Self-Instruct: Aligning Language Models with Self-Generated Instructions
Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
Large “instruction-tuned” language models (finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is limited in quantity, diversity, and creativity, therefore hindering the generality of the tuned model. We introduce Self-Instruct, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. Our pipeline generates instructions, input, and output samples from a language model, then prunes them before using them to finetune the original model. Applying our method to vanilla GPT3, we demonstrate a 33% absolute improvement over the original model on Super-NaturalInstructions, on par with the performance of InstructGPT_001, which is trained with private user data and human annotations. For further evaluation, we curate a set of expert-written instructions for novel tasks, and show through human evaluation that tuning GPT3 with Self-Instruct outperforms using existing public instruction datasets by a large margin, leaving only a 5% absolute gap behind InstructGPT_001. Self-Instruct provides an almost annotation-free method for aligning pre-trained language models with instructions, and we release our large synthetic dataset to facilitate future studies on instruction tuning.
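
The sketch below illustrates the bootstrap loop the abstract describes; it is not the authors' released pipeline. `generate_samples` is a hypothetical placeholder for the step that prompts the base LM with in-context examples to propose new (instruction, input, output) triples. The paper prunes near-duplicates with ROUGE-L; a simple longest-common-subsequence overlap stands in for that here, and the threshold and round counts are illustrative.

```python
# A minimal Self-Instruct-style bootstrap loop (illustrative sketch).

def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def too_similar(new_instruction, pool, threshold=0.7):
    """Reject an instruction whose overlap with any pooled one is too high."""
    new_tokens = new_instruction.lower().split()
    for existing in pool:
        old_tokens = existing["instruction"].lower().split()
        denom = max(len(new_tokens), len(old_tokens), 1)
        if lcs_len(new_tokens, old_tokens) / denom > threshold:
            return True
    return False

def self_instruct(seed_tasks, generate_samples, rounds=10, per_round=8):
    """Grow an instruction pool by bootstrapping off the model's own generations.

    `seed_tasks` is a list of {"instruction", "input", "output"} dicts;
    `generate_samples(pool, n)` is a placeholder that prompts the base LM with
    in-context examples from the pool and returns n new candidate dicts.
    """
    pool = list(seed_tasks)
    for _ in range(rounds):
        for sample in generate_samples(pool, n=per_round):
            if not too_similar(sample["instruction"], pool):
                pool.append(sample)
    return pool  # the grown pool is finally used to finetune the original model
```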

3 thoughts on “Big Language Models Can Train Small and Cheap Language Models”

  1. This is what the singularity will look like. AIs start to improve AIs and the speed of development goes exponential. At this point, we still have humans in the loop and human data is the root source of everything. The moment this changes, things will start getting exciting and we may need a plan B.

    • Yeah, don’t forget plan C, the flip side of the coin for B, where the machines worship us or something similar; that could be almost as bad, or worse. Wars can end, but worship?

  2. The great discoveries/inventions were attention and transformer NNs back in 2017.

    Then it detonated, each iteration reinforcing the next even more.

    Those discoveries will be hailed like the invention of fire and writing, or go down in infamy (or oblivion), depending how things go in the next few years.
