Alibaba has developed an artificial intelligence model that scored better than humans in a Stanford University reading and comprehension test. Alibaba asked the AI to provide exact answers to more than 100,000 questions comprising a quiz that’s considered one of the world’s most authoritative machine-reading gauges. The model developed by Alibaba’s Institute of Data Science of Technologies scored 82.44, edging past the 82.304 that rival humans achieved.
Microsoft achieved a similar feat, scoring 82.650 on the same test, but those results were finalized a day after Alibaba’s.
Stanford Question Answering Dataset (SQuAD) is a new reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage. With 100,000+ question-answer pairs on 500+ articles, SQuAD is significantly larger than previous reading comprehension datasets.