OpenAI, the non-profit funded by Elon Musk, has created a breakthrough system for writing high-quality text. It can write text and perform basic reading comprehension, machine translation, question answering, and summarization, all without task-specific training.
The system is able to take a few sentences of sample writing and then produce a multi-paragraph article in the style and context of the sample. This capability would let AIs impersonate the writing style of any person, given previous writing samples.
The model, GPT-2, is a 1.5-billion-parameter Transformer that achieves state-of-the-art results on 7 out of 8 tested language modeling datasets in a zero-shot setting, yet it still underfits its training dataset, WebText (in other words, the model has not yet captured everything the data has to offer). Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems that learn to perform tasks from naturally occurring demonstrations.
The researchers created a new web scrape that emphasizes document quality. They only used outbound links from Reddit posts that had a karma score of 3 or higher. The karma threshold serves as a rough signal that other users found the linked article interesting, educational, or just funny. The resulting dataset (WebText) contains the text from 45 million Reddit-sourced links.
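The filtering rule described above can be sketched in a few lines of Python. This is only an illustration, not OpenAI's actual pipeline: the `Submission` record, the `select_links` function, and the sample URLs are hypothetical, and the de-duplication step is an assumption added for tidiness. Only the karma threshold of 3 comes from the description above.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

MIN_KARMA = 3  # threshold stated in the article


@dataclass(frozen=True)
class Submission:
    """Hypothetical record for one Reddit post that links somewhere."""
    url: str
    karma: int


def is_outbound(url: str) -> bool:
    """True if the URL points outside reddit.com."""
    host = urlparse(url).netloc.lower()
    if not host:
        return False
    return not (host == "reddit.com" or host.endswith(".reddit.com"))


def select_links(submissions, min_karma=MIN_KARMA):
    """Keep outbound URLs whose post has at least min_karma karma.

    Duplicate URLs are dropped (an assumption of this sketch),
    keeping the first occurrence.
    """
    seen = set()
    kept = []
    for sub in submissions:
        if is_outbound(sub.url) and sub.karma >= min_karma and sub.url not in seen:
            seen.add(sub.url)
            kept.append(sub.url)
    return kept


sample = [
    Submission("https://example.com/a", 5),       # kept
    Submission("https://example.org/b", 2),       # karma too low
    Submission("https://www.reddit.com/r/x", 10), # not an outbound link
    Submission("https://example.com/a", 7),       # duplicate URL
    Submission("https://example.net/c", 3),       # kept (exactly at threshold)
]
print(select_links(sample))
```

A real scrape would still need to download each page and extract its text; the sketch covers only the link-selection step.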
The system can write text according to the desired style and context.
AI Ghostwriting and Better Translations and Speech Recognition
Many beneficial applications of this system could emerge, such as:
* AI writing assistants
* More capable dialogue agents
* Unsupervised translation between languages
* Better speech recognition systems
Fake News on Steroids and Super-spam
Obvious malicious applications could also emerge, such as:
* Generating misleading news articles
* Impersonating others online
* Automating the production of abusive or faked content to post on social media
* Automating the production of spam/phishing content
OpenAI was worried about enabling more fake news and spam, so it limited the release of the system. Other AI groups will still be able to replicate the work, and a few will probably be able to copy it within the next year.
AI like this will become common, potentially leading to a flood of AI-generated fake news and spam that could become widespread in 1 to 2 years.
Written By Christina Wong. nextbigfuture.com
Product Leader who is passionate about turning “what’s impossible” into innovative realities. Building a new digital SaaS platform to transform the $12+ billion Cisco Customer Experience business.