Fun LoL seeks information regarding mathematical frameworks, architectures, and methods that would help answer questions such as:
- What are the number of examples necessary for training to achieve a given accuracy performance? (e.g., Would a training set with fewer than the 30 million moves that programmers provided to this year’s winning machine have sufficed to beat a Go grand champion? How do you know?)
- What are important trade-offs and their implications? (e.g., size, performance accuracy, processing power considerations)
- How “efficient” is a given learning algorithm for a given problem?
- How close is the expected achievable performance of a learning algorithm compared to what can be achieved at the limit?
- What are the effects of noise and error in the training data?
- What are the potential gains possible due to the statistical structure of the model generating the data?
“We’ve seen advances in machine learning and AI enabling computers to beat human champions playing games like Jeopardy, chess, and most recently Go, the ancient Chinese strategy game,” said Reza Ghanadan, DARPA program manager. “What’s lacking, however, is a fundamental theoretical framework for understanding the relationships among data, tasks, resources, and measures of performance—elements that would allow us to more efficiently teach tasks to machines and allow them to generalize their existing knowledge to new situations. With Fun LoL we’re addressing how the quest for the ultimate learning machine can be measured and tracked in a systematic and principled way.”
As it stands now with machine learning, Ghanadan noted, even a small change in task often requires programmers to create an entirely new machine teaching process. “If you slightly tweak a few rules of the game Go, for example, the machine won’t be able to generalize from what it already knows. Programmers would need to start from scratch and re-load a data set on the order of tens of millions of possible moves to account for the updated rules.”
These kinds of challenges are especially pressing for the Department of Defense, whose specialized systems typically don’t have large training sets to begin with and can’t afford to rely on trial and error methods, which come at high cost. Additionally, defense against complex threats requires machines to adapt and learn quickly, so it is important that they be able to generalize creatively from previously learned concepts.
As an historical example of what Fun LoL is trying to achieve, Ghanadan pointed to a mathematical construct called Shannon’s theorem that helped revolutionize communications theory. Shannon's theorem established a mathematical framework showing that for any given communication channel, it is possible to communicate information nearly error-free up to a computable maximum rate through that channel. The theorem addresses tradeoffs in bandwidth, source data distribution, noise, methods of communication transmission, error correction coding, measures of information, and other factors that can affect determinations of communications efficiency. “Shannon’s theorem provided the fundamental basis that catalyzed the widespread operational application of modern digital and wireless communications,” Ghanadan said. “The goal of Fun LoL is to achieve a similar mathematical breakthrough for machine learning and AI.”
DARPA’s Fun LoL is seeking information that could inform novel approaches to this problem. Technology areas that may be relevant include information theory, computer science theory, statistics, control theory, machine learning, AI, and cognitive science.
RESEARCH AREAS OF INTEREST
Foundational general theory: Articulation of a general mathematical framework that, independent of any particular machine learning method, provides quantifiable and generalizable measures of learning and fundamental limits across supervised, unsupervised, and reinforcement learning settings. Such a framework may need to consider a number of factors such as:
• The type, quality, and relevance of the signals available from the data (e.g., labels, distance/similarity measures, rewards);
• The structure, complexity, and observability of the target task/concept space; and
• Task performance metrics (e.g., accuracy, speed, computational complexity, sample complexity) as well as the interactions and trade-offs among such factors.
Applications of theory: Application of a general framework to existing machine learning methods in order to characterize the capabilities or performance envelopes of current techniques in each of the following categories:
• Supervised learning from example input/output pairs (e.g., deep neural networks, decision trees, support vector machines, random forests, etc.);
• Unsupervised learning from only input data (e.g., clustering, topic models, principal component analysis, community detection, etc.); and
• Reinforcement (policy) learning from reward/penalty signals (e.g., Q-learning, direct policy search, etc.).
This Research Area should also address questions, such as:
• What are the performance limits of methods/architectures that combine the algorithms from the above categories?
• For what classes of problems are these algorithms optimal or near the performance limit?
SOURCES - DARPA