OpenAI has shown that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. They trained GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and tested its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 can translate, answer questions, and perform cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation. It can unscramble words, use a novel word in a sentence, and perform 3-digit arithmetic. They also identify some datasets where GPT-3’s few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora.
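"Few-shot demonstrations specified purely via text" means the task is given to the model as a plain prompt of worked examples, with no weight updates. A minimal sketch of what such a prompt might look like for 3-digit arithmetic (the template below is illustrative, not the paper's exact format):

```python
# Sketch of few-shot prompting as described for GPT-3: the task is
# conveyed entirely as text demonstrations, with no gradient updates.
# The question/answer template here is a hypothetical illustration.

def build_few_shot_prompt(examples, query):
    """Format (a, b) addition demonstrations plus a final unanswered query."""
    lines = [f"Q: What is {a} plus {b}?\nA: {a + b}" for a, b in examples]
    a, b = query
    lines.append(f"Q: What is {a} plus {b}?\nA:")
    return "\n\n".join(lines)

demos = [(123, 456), (702, 88), (310, 290)]
prompt = build_few_shot_prompt(demos, (245, 611))
print(prompt)
```

The resulting string is what gets fed to the model; the model's continuation after the final "A:" is taken as its answer.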
GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. Even generated articles around 500 words long are hard to tell apart from human-written ones.
GPT-3 used ten times as much training data as GPT-2.
It was 65% accurate on SAT analogy questions.
SOURCES- OpenAI, Arxiv paper
Written by Brian Wang, Nextbigfuture.com
Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technology and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.
The thing is though, do we really need or want an AI to do those things? They just need to do useful things for us, like drive a car or control a robot to do useful tasks that humans don’t always want to do.
Nothing in particular, but in this example, he linked to an early version of the paper, while the ArXiv page shows that v3 of the PDF is available. That’s why ArXiv and similar preprint servers prefer you link to the paper landing page, which lets you select either the first or the latest version of the paper, and also provides means for searching for related papers by topic or author.
Also, it’s generally rude to direct link to files if you can freely access them from a page hosted by the same organization. I could understand if there was no page, or some sort of paywall or obfuscation mechanism making access difficult, but that isn’t the case here.
Yeah, for making propaganda and spamming social networks they will be great, as a figure of speech.
None of this wordsmithing moves AI one iota closer to sentience. For that, you need 4+ billion years of evolution. AI is good for helping to disprove the existence of God, however, showing that top-down “creator” solutions don’t produce sentient beings. We still have more in common with an Amoeba than a Supercomputer – e.g. the instincts, needs, or drives, for hunger, reproduction, escaping from danger, respiration, eliminating/fleeing waste, sense of self, and probably a few other things I didn’t think of. You can’t really “program that in,” it’s all inborn at the cellular level and is an inherent quality of being alive.
There are a few humans who write like bots. Not worth communicating with them.
Words are labels. An understanding of the relations among labels can only produce lifeless prose, since there is no knowledge or understanding whatsoever of what is labelled. IMHO, the only useful application of such engines is in translation.
A parameter space so large it’s comparable to the corpus itself and several orders of magnitude more processing than Peter Turney used in 2005 with Latent Relational Analysis: “LRA achieves state-of-the-art results, reaching human-level performance on the analogy questions”
https://arxiv.org/ftp/cs/papers/0508/0508053.pdf
It would appear Musk has discovered a way to save us all from turning into paperclips:
Send the field of AGI research down a rat hole.
And unlike human journalists, getting it right sometimes and totally messing it up in others.
Just making the articles up on the spot from existing narratives and whatever keywords have a high click count.
Same as most “human” journalists.
What’s wrong with the pdf?
I don’t think generalized language models are going to get us all the way to human level NLP. I think some specialized language models for specific domains as well as a general world model will help. I also think performance would improve if grammar rules were programmed directly in instead of relying on unsupervised training to get us there.
Also, sockpuppet comment bots. It will remain to be seen whether humans will actually be able to communicate with each other amidst the sea of commentbots.
Brian has a bad habit of direct linking to ArXiv paper PDFs, rather than the paper landing page
https://arxiv.org/abs/2005.14165v3
GPT-3 model code
https://github.com/openai/gpt-3
Generating articles from what?
That is not the same as writing a Tolkien novel.
Unscrambling words into sentences is good. Does anyone have a github link for code that uses this trick in particular?
ah, the promise of cheaper clickbait news, wonderful!