Deep Learning Machine Beats Humans in IQ Test and Performs Between Bachelor's and Master's Degree Level

Chinese researchers have built a deep learning machine that, for the first time, outperforms the average human at answering verbal reasoning questions.

To handle words that have more than one meaning, the researchers take each word and look for other words that often appear nearby in a large corpus of text. They then use an algorithm to see how these neighboring words cluster. The final step is to look up the different meanings of the word in a dictionary and match each cluster to a meaning.

This can be done automatically because dictionary definitions include sample sentences showing the word used in each of its senses. By calculating the vector representation of these sentences and comparing it with the vector representation of each cluster, it is possible to match each cluster to a sense.

The overall result is a way of recognizing the multiple different senses that some words can have.
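
The cluster-to-sense matching described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the researchers' implementation: the two-dimensional word vectors, the cluster centroids, the example sentences and every function name are made up, and a sentence is represented simply as the average of its word vectors, which the article does not specify.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def sentence_vector(sentence, word_vectors):
    """Average the vectors of the known words in a sentence (an assumption)."""
    known = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not known:
        raise ValueError("no known words in sentence")
    return [sum(col) / len(known) for col in zip(*known)]

def match_clusters_to_senses(cluster_centroids, sense_examples, word_vectors):
    """Assign each context cluster to the sense whose example sentence is closest."""
    assignment = {}
    for cluster_id, centroid in cluster_centroids.items():
        scores = {
            sense: cosine(centroid, sentence_vector(example, word_vectors))
            for sense, example in sense_examples.items()
        }
        assignment[cluster_id] = max(scores, key=scores.get)
    return assignment

# Hypothetical toy data: axis 0 loosely means "finance", axis 1 "river".
WORD_VECTORS = {
    "money": [1.0, 0.0], "deposit": [0.9, 0.1], "loan": [0.8, 0.0],
    "river": [0.0, 1.0], "water": [0.1, 0.9], "shore": [0.0, 0.8],
}
# Centroids of two clusters of contexts in which the word "bank" appears.
CLUSTER_CENTROIDS = {0: [0.85, 0.05], 1: [0.05, 0.9]}
# One sample sentence per dictionary sense of "bank".
SENSE_EXAMPLES = {
    "bank (financial institution)": "she went to deposit money",
    "bank (side of a river)": "the river water reached the shore",
}

RESULT = match_clusters_to_senses(CLUSTER_CENTROIDS, SENSE_EXAMPLES, WORD_VECTORS)
print(RESULT)
```

With this toy data, the cluster dominated by financial contexts is paired with the financial sense of "bank" and the other cluster with the riverside sense.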

They also identify the category of each question so that the computer then knows which answering strategy it should employ. This is straightforward since the questions in each category have similar structures.
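
Because questions in each category share a structure, even a few hand-written patterns can illustrate the idea of routing a question to the right strategy. The sketch below is an assumption made for illustration: the article does not say which classifier the researchers used, and the patterns, category labels and sample questions here are hypothetical.

```python
import re

# Hypothetical surface patterns, checked in order; the first match wins.
PATTERNS = [
    ("analogy", re.compile(r"\bis to\b.*\bas\b.*\bis to\b", re.IGNORECASE)),
    ("classification", re.compile(r"\bodd one out\b", re.IGNORECASE)),
    ("antonym", re.compile(r"\b(opposite|antonym)\b", re.IGNORECASE)),
    ("synonym", re.compile(r"\b(closest|same meaning|synonym)\b", re.IGNORECASE)),
]

def classify_question(text):
    """Return the question category, or 'unknown' if no pattern matches."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "unknown"

for question in [
    "Isotherm is to temperature as isobar is to?",
    "Which is the odd one out?",
    "Which word is most opposite to musical?",
    "Which word is closest to irrational?",
]:
    print(classify_question(question), "<-", question)
```

A learned classifier trained on labeled questions would be the more robust choice; the rule list is used here only because it keeps the example self-contained.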

Just over 100 years ago, the German psychologist William Stern introduced the intelligence quotient as a way of evaluating human intelligence. Since then, IQ tests have become a standard feature of modern life and are used to determine children’s suitability for schools and adults’ ability to perform jobs.

These tests usually contain three categories of questions: logic questions, such as finding patterns in sequences of images; mathematical questions, such as finding patterns in sequences of numbers; and verbal reasoning questions, which are based on analogies, classifications, synonyms and antonyms.

Arxiv – Solving Verbal Comprehension Questions in IQ Test by Knowledge-Powered Word Embedding

For each type of question, they devised a solver that uses the standard vector methods together with the multi-sense upgrade they developed.
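
As an illustration of the "standard vector methods" for one question type, an analogy of the form "a is to b as c is to ?" is commonly answered by computing b - a + c and choosing the candidate whose vector is most similar to the result. The sketch below shows that generic vector-offset approach with made-up three-dimensional vectors and hypothetical function names; it does not include the multi-sense upgrade and is not the researchers' actual solver.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def solve_analogy(a, b, c, candidates, vectors):
    """Pick the candidate whose vector is closest to vec(b) - vec(a) + vec(c)."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    return max(candidates, key=lambda word: cosine(target, vectors[word]))

# Hypothetical toy vectors: axes loosely mean (person, female, royal).
VECTORS = {
    "man":    [1.0, 0.0, 0.0],
    "woman":  [1.0, 1.0, 0.0],
    "king":   [1.0, 0.0, 1.0],
    "queen":  [1.0, 1.0, 1.0],
    "prince": [0.9, 0.0, 0.8],
    "apple":  [0.0, 0.1, 0.0],
}

ANSWER = solve_analogy("man", "woman", "king", ["queen", "prince", "apple"], VECTORS)
print(ANSWER)
```

In a multi-sense setting, each word would have one vector per sense, so a solver would also need to decide which sense of each word to use before applying the offset.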

They compare this deep learning technique with other algorithmic approaches to verbal reasoning tests and also with human performance. For this, they posed the questions to 200 people recruited via Amazon’s Mechanical Turk crowdsourcing facility and collected basic information about their ages and educational backgrounds.

And the results are impressive. “To our surprise, the average performance of human beings is a little lower than that of our proposed method,” they say.

Human performance on these tests tends to correlate with educational background. So people with a high school education tend to do least well, while those with a bachelor’s degree do better and those with a doctorate perform best. “Our model can reach the intelligence level between the people with the bachelor degrees and those with the master degrees,” say Huazheng and co.

Abstract
Intelligence Quotient (IQ) Test is a set of standardized questions designed to evaluate human intelligence. Verbal comprehension questions appear very frequently in IQ tests, which measure human’s verbal ability including the understanding of the words with multiple senses, the synonyms and antonyms, and the analogies among words. In this work, we explore whether such tests can be solved automatically by artificial intelligence technologies, especially the deep learning technologies that are recently developed and successfully applied in a number of fields. However, we found that the task was quite challenging, and simply applying existing technologies (e.g., word embedding) could not achieve a good performance, mainly due to the multiple senses of words and the complex relations among words. To tackle this challenge, we propose a novel framework consisting of three components. First, we build a classifier to recognize the specific type of a verbal question (e.g., analogy, classification, synonym, or antonym). Second, we obtain distributed representations of words and relations by leveraging a novel word embedding method that considers the multi-sense nature of words and the relational knowledge among words (or their senses) contained in dictionaries. Third, for each specific type of questions, we propose a simple yet effective solver based on the obtained distributed word representations and relation representations. According to our experimental results, our proposed framework can not only outperform existing methods for solving verbal comprehension questions but also exceed the average performance of human beings. The results are highly encouraging, indicating that with appropriate uses of the deep learning technologies, we could be a further step closer to the true human intelligence.

Conclusions and Future Work
In this paper, we investigated how to automatically solve verbal comprehension questions in the Intelligence Quotient (IQ) Test by using AI technologies, especially the deep learning techniques that are recently developed and successfully applied in text mining and natural language processing. To fulfill the challenging task, especially in terms of the multiple senses of words and the complex relations among words, we proposed a novel framework consisting of three components:
(i) the first component is a classifier that aims to recognize the specific type of a verbal comprehension question;
(ii) the second component leverages a novel deep learning technique to co-learn the representations of both word-sense pairs and relations among words (or their senses);
(iii) the last component is comprised of dedicated solvers, based on the obtained word-sense pair representations and relation representations, for addressing each of the specific types of questions.

Experimental results have illustrated that this novel framework can achieve better performance than existing methods for solving verbal comprehension questions and even exceed the average performance of human beings. While this work is a very early attempt to solve IQ Test using AI techniques, the evaluation results are highly encouraging and indicate that, with appropriately leveraging the deep learning technologies, we could be a further step closer to the true human intelligence. In the future, we plan to leverage more types of knowledge from the knowledge graph, such as Freebase, to enhance the power of obtaining word-sense and relation embeddings. Moreover, we will explore new frameworks based on deep learning or other AI techniques to solve other parts of IQ tests beyond verbal comprehension questions.

SOURCES – Arxiv, Technology Review