DARPA seeks scalable and dynamic testing for learning robots

DARPA announced a new research program called Assured Autonomy that aims to advance the ways computing systems can learn and evolve to better manage variations in the environment and enhance the predictability of autonomous systems like driverless vehicles and unmanned aerial vehicles (UAVs).

“The assurance approaches today are predicated on the assumption that the systems, once deployed, do not learn and evolve,” said Sandeep Neema, the DARPA program manager leading the effort.

One approach to assurance of autonomous systems that has recently garnered attention, particularly in the context of self-driving vehicles, is based on the idea of “equivalent levels of safety”: the autonomous system must be at least as safe as the comparable human-in-the-loop system it replaces. The approach compares known incident rates of manned systems (for example, the number of accidents per thousands of miles driven) with corresponding incident rates for autonomous systems established through physical trials. Studies and analyses indicate, however, that assuring the safety of autonomous systems in this manner alone is prohibitive, requiring millions of physical trials, perhaps spanning decades. Simulation techniques have been advanced to reduce the number of physical trials needed, but on their own they offer little confidence, particularly with respect to low-probability, high-consequence events.
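
To see why trial-based assurance is so expensive, consider the standard “rule of three” bound: observing zero failures in n independent trials only supports the claim that the failure rate is below roughly 3/n at 95% confidence. The sketch below works this out in Python; the baseline rate is an illustrative round number, not a figure from the article.

```python
import math

def miles_required(target_rate_per_mile, confidence=0.95):
    """Failure-free miles needed before the failure rate can be claimed
    to be below target_rate_per_mile at the given confidence level
    (zero failures in n trials bounds the rate at roughly 3/n for 95%)."""
    return math.log(1 - confidence) / math.log(1 - target_rate_per_mile)

# Illustrative baseline (not from the article): roughly one fatal
# accident per 100 million miles of human driving.
baseline = 1e-8
print(f"{miles_required(baseline):,.0f} failure-free miles needed")
# -> about 300 million miles, and far more to resolve a rate *difference*
```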

In contrast to prescriptive, process-oriented standards for safety and assurance, a goal-oriented approach, such as the one espoused by Neema, is arguably more suitable for systems that learn, evolve, and encounter operational variations. In the course of the Assured Autonomy program, researchers will aim to develop tools that provide foundational evidence that a system can satisfy explicitly stated functional and safety goals, resulting in a measure of assurance that can itself evolve with the system.
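
One way to picture a goal-oriented, evolvable assurance case is as a tree of claims backed by evidence, whose aggregate confidence can be recomputed whenever the system or its evidence changes. The sketch below is a minimal illustration of that idea only; the class, the GSN-style structure, and the weakest-link propagation rule are assumptions for illustration, not the program's actual formalism.

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    """One claim in a GSN-style assurance case tree (illustrative only)."""
    claim: str
    evidence: dict = field(default_factory=dict)  # evidence name -> confidence in [0, 1]
    subgoals: list = field(default_factory=list)

    def confidence(self) -> float:
        # Assumed weakest-link rule: a goal is only as strong as its
        # weakest piece of evidence or weakest subgoal.
        scores = list(self.evidence.values())
        scores += [g.confidence() for g in self.subgoals]
        return min(scores) if scores else 0.0

# Re-evaluate the case dynamically as operational evidence accumulates.
root = Goal("UAV maintains safe separation", subgoals=[
    Goal("Perception detects obstacles", evidence={"design-time verification": 0.9}),
    Goal("Planner respects geofence", evidence={"runtime monitor logs": 0.95}),
])
print(root.confidence())                           # 0.9
root.subgoals[0].evidence["post-update field data"] = 0.7
print(root.confidence())                           # 0.7 -> degraded assurance flags review
```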

DARPA Goal – Develop rigorous design and analysis technologies for the continual assurance of learning-enabled autonomous systems, in order to guarantee safety properties in adversarial environments

Program objectives

• Increase scalability of design-time assurance (see the interval-propagation sketch after this list)
  – What is the baseline capability of the proposed methods, in terms of the hybrid state space and the number and complexity of learning-enabled components?
  – How do you plan to scale up by an order of magnitude?
  – How will you characterize the tradeoffs between the fidelity of your modeling abstractions and the scalability of the verification approach?
• Reduce overhead of operation-time assurance (see the monitor sketch after this list)
  – What is the baseline overhead of the operation-time assurance monitoring techniques?
  – How do you plan to reduce it to below 10% of the nominal system resource utilization?
• Scale up dynamic assurance (see the assurance-case sketch above)
  – What is the size and scale of the dynamic assurance case that can be developed and dynamically evaluated with your tools?
• Reduce trials to assurance (see the sampling sketch after this list)
  – How will your approach quantifiably reduce the need for statistical testing?
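
On the design-time objective, a common family of verification techniques bounds a learned component's outputs over an entire input region rather than testing individual points. Below is a minimal interval bound propagation sketch for a toy two-layer ReLU network; the weights, input region, and network size are made up for illustration, and this is one standard technique in the space, not the program's prescribed method.

```python
import numpy as np

def ibp_linear(W, b, lo, hi):
    """Propagate the box [lo, hi] through x -> W @ x + b using standard
    interval arithmetic: track the box center and radius separately."""
    mid, rad = (lo + hi) / 2.0, (hi - lo) / 2.0
    mid_out = W @ mid + b
    rad_out = np.abs(W) @ rad
    return mid_out - rad_out, mid_out + rad_out

# Toy two-layer ReLU network with made-up weights.
W1, b1 = np.array([[1.0, -0.5], [0.3, 0.8]]), np.zeros(2)
W2, b2 = np.array([[0.7, -1.2]]), np.array([0.1])

lo, hi = np.array([-0.1, -0.1]), np.array([0.1, 0.1])  # input region to certify
lo, hi = ibp_linear(W1, b1, lo, hi)
lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)      # ReLU is monotone
lo, hi = ibp_linear(W2, b2, lo, hi)
print(lo, hi)  # sound (if loose) bounds on the output over the whole region
```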
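
On the operation-time objective, the basic pattern is a runtime monitor that checks each command from a learning-enabled controller against a safety envelope and substitutes a verified fallback on violation; the 10% target above concerns keeping the cost of such checks small relative to nominal operation. Everything in the sketch below (the controller, the envelope, the overhead accounting) is a hypothetical stand-in.

```python
import time

SPEED_LIMIT = 10.0  # hypothetical safety envelope, m/s

def learned_controller(state):
    # Stand-in for a learning-enabled component whose behavior
    # cannot be fully characterized at design time.
    return {"speed": state["speed"] + 0.5}

def safe_fallback(state):
    return {"speed": min(state["speed"], SPEED_LIMIT)}

def monitored_step(state):
    """Check the controller's command against the envelope each cycle,
    substitute the fallback on violation, and report what fraction
    of the cycle the monitoring itself consumed."""
    t0 = time.perf_counter()
    cmd = learned_controller(state)
    t1 = time.perf_counter()
    safe = cmd["speed"] <= SPEED_LIMIT            # the runtime check
    t2 = time.perf_counter()
    monitor_fraction = (t2 - t1) / max(t2 - t0, 1e-12)
    return (cmd if safe else safe_fallback(state)), monitor_fraction

cmd, frac = monitored_step({"speed": 9.8})
print(cmd, f"monitor overhead: {frac:.1%}")
```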
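
On the trials-reduction objective, one standard statistical route is importance sampling: simulate under a distribution that makes rare failures common, then reweight so the estimate remains unbiased under the nominal model. The disturbance model and failure threshold below are illustrative assumptions, chosen so the true failure probability is known in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

def failure(x):
    # Stand-in rare event: a disturbance exceeding a safety threshold.
    return x > 4.0

# Naive Monte Carlo under the nominal disturbance model N(0, 1):
# P(failure) is ~3.2e-5, so 100k samples see only a handful of hits.
naive = failure(rng.normal(0.0, 1.0, N)).mean()

# Importance sampling: draw from a shifted proposal N(4, 1) where
# failures are common, then reweight by nominal pdf / proposal pdf.
x = rng.normal(4.0, 1.0, N)
w = np.exp(-0.5 * x**2) / np.exp(-0.5 * (x - 4.0)**2)
is_estimate = (failure(x) * w).mean()

print(naive, is_estimate)  # IS converges near 3.2e-5 with far less variance
```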

The program comprises three phases, all of which are to be completed within four years.

The U.S. Army Robotics and Autonomous Systems (RAS) strategy report for 2015-2040 identifies a range of capability objectives that rely on autonomous systems and higher levels of autonomy, including enhanced situational awareness, reduced cognitive workload, force protection, cyber defense, and logistics.

Several factors impede the deployment and adoption of autonomous systems:

1. In the absence of an adequately high level of autonomy that can be relied upon, substantial operator involvement is required, which not only severely limits operational gains but also creates significant new challenges in the areas of human-machine interaction and mixed-initiative control.
2. Achieving higher levels of autonomy in uncertain, unstructured, and dynamic environments, on the other hand, increasingly involves data-driven machine learning techniques with many open systems science and systems engineering challenges.
3. Machine learning techniques widely used today are inherently unpredictable and lack the necessary mathematical framework to provide guarantees on correctness, while DoD applications that depend on safe and correct operation for mission success require predictable behavior and strong assurance.