Safety Testing of Self-Driving Cars Needs to Consider Possible Deep Learning Weaknesses

Philip Koopman, a professor at Carnegie Mellon University, believes the biggest hole in the Federal Automated Vehicles Policy published in late September is the regulators’ failure to grapple head-on with the fundamental difficulties of testing Machine Learning, a problem already known to the scientific and engineering community.

Representativeness of data
Carmakers are building a mock city in Michigan, for example, to test autonomous vehicles. What’s important, though, is whether the test data represents real-world driving conditions.

A highly autonomous vehicle is designed to operate only in a certain designated area, such as “driving only in downtown Pittsburgh.” In DoT lingo, this concept is the “Operational Design Domain.” So the question to ask is: Does the data used for that vehicle truly represent that particular designated area?
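
One rough way to put a number on that question is to compare the mix of conditions in the test data with the mix observed in the designated area. The following is a minimal sketch, not a method attributed to Koopman or the DoT: the condition labels, the sample counts, and the choice of total variation distance as the gap measure are all assumptions made for illustration.

```python
from collections import Counter

def category_shares(labels):
    """Return the fraction of samples that fall into each category."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

def total_variation_distance(test_labels, field_labels):
    """0.0 means identical category mixes; 1.0 means no overlap at all."""
    test = category_shares(test_labels)
    field = category_shares(field_labels)
    categories = set(test) | set(field)
    return 0.5 * sum(abs(test.get(c, 0.0) - field.get(c, 0.0)) for c in categories)

# Hypothetical condition labels, invented purely for illustration.
test_track = ["clear"] * 90 + ["rain"] * 10
downtown = ["clear"] * 60 + ["rain"] * 25 + ["snow"] * 15

gap = total_variation_distance(test_track, downtown)
never_tested = sorted(set(downtown) - set(test_track))
print(f"distribution gap: {gap:.2f}, conditions never tested: {never_tested}")
```

A large gap, or any condition that appears in the designated area but never in the test data, would be a prompt to ask for more evidence rather than a verdict in itself.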

Avoid over-fitting
“Over-fitting” is a well-known problem in Machine Learning. Over-fitting can occur when a model begins to “memorize” training data rather than “learning” to generalize from trends, Koopman explained. An over-trained machine can perfectly predict the training data, but it falters when attempting predictions about new or unseen data.

Regulators should be fully aware of this pitfall and at least raise questions about it.
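
The symptom of over-fitting can be shown with a toy comparison of accuracy on training data versus held-out data. This is only a sketch under stated assumptions: the data set is invented, two training labels are deliberately noisy, and the "memorizer" (a nearest-neighbor lookup) and the "simple rule" are stand-ins for illustration, not models any carmaker is known to use.

```python
def accuracy(predict, samples):
    """Fraction of (feature, label) pairs that `predict` gets right."""
    return sum(predict(x) == y for x, y in samples) / len(samples)

# Hypothetical data: feature = distance to an obstacle in meters,
# label = 1 if the car should brake. The labels at 6 m and 14 m are noise.
train = [(2, 1), (4, 1), (6, 0), (8, 1), (12, 0), (14, 1), (16, 0), (18, 0)]
held_out = [(2.5, 1), (5.5, 1), (6.5, 1), (8.5, 1),
            (11.5, 0), (13.5, 0), (14.5, 0), (17.5, 0)]

def memorizer(x):
    """Copies the label of the closest training sample, noise included."""
    nearest = min(train, key=lambda sample: abs(sample[0] - x))
    return nearest[1]

def simple_rule(x):
    """Follows the overall trend: brake when closer than 10 m."""
    return 1 if x < 10 else 0

for name, model in [("memorizer", memorizer), ("simple rule", simple_rule)]:
    train_acc = accuracy(model, train)
    held_out_acc = accuracy(model, held_out)
    print(f"{name}: train={train_acc:.2f} held-out={held_out_acc:.2f} "
          f"gap={train_acc - held_out_acc:+.2f}")
```

The memorizer scores perfectly on the data it was trained on and much worse on data it has not seen. That gap between training and held-out performance is the kind of evidence a reviewer could ask to see.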

Validation of testing environment
“You need to make sure simulation is mapped to the real world,” said Koopman. You need to examine whether the testing environment is representative, and whether the statistical analysis of validation via testing (including simulation) effectively measures performance against the established safety goals.

Analysis of brittleness
Machine Learning is tough. When an ML-based system encounters something it has never seen before (known as a “long tail” event or an “outlier”), “the ML-system can freak out,” said Koopman.

“When something really unusual happens, people – human drivers – would at least realize something unusual has happened,” said Koopman, and they might try to do something about it, successfully or unsuccessfully. In contrast, a machine might not register this extreme anomaly. It could just keep on going.

This is known as “brittleness” in ML terms. “When that happens, I want to see a plan,” said Koopman, as to how an autonomous vehicle should cope.
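
One simple form such a plan could take is a monitor that notices when an input lies far outside anything seen in training and triggers a fallback instead of carrying on. The sketch below is an assumption-laden illustration, not a technique Koopman describes: the feature (object width), the sample values, the z-score test, the threshold of 4.0, and the "fallback" action are all hypothetical.

```python
from statistics import mean, stdev

class NoveltyMonitor:
    """Flags inputs that lie far outside the range seen during training."""

    def __init__(self, training_values, max_z=4.0):
        # Assumes at least two distinct training values, so the spread is non-zero.
        self.center = mean(training_values)
        self.spread = stdev(training_values)
        self.max_z = max_z  # hypothetical threshold, not a validated value

    def is_unusual(self, value):
        """True when the value is more than max_z standard deviations from the mean."""
        z = abs(value - self.center) / self.spread
        return z > self.max_z

# Hypothetical training feature: detected object width in meters.
seen_widths = [1.7, 1.8, 1.9, 2.0, 1.8, 2.1, 1.9, 2.5, 1.6, 1.8]
monitor = NoveltyMonitor(seen_widths)

def respond(width):
    """Return the planned reaction: continue normally or enter a fallback mode."""
    if monitor.is_unusual(width):
        return "fallback"  # placeholder for a real plan, e.g. slow down
    return "proceed"

print(respond(1.9), respond(12.0))
```

A single-feature z-score is far too crude for a real perception system. The point is only that the machine, like a human driver, should at least register that something unusual has happened.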

‘Safe,’ ‘unsafe,’ ‘not sure’
When an autonomous vehicle is operating in conditions outside its intended ODD (Operational Design Domain), it has to recognize that it is outside its comfort zone. Operation under such conditions must be deemed invalid.

Koopman said, “I’m not saying that the autonomous vehicle operating outside its envelope is NOT safe, but being ‘unsafe’ and ‘not sure’ aren’t the same thing.”

Noting that there are only three situations – ‘safe,’ ‘unsafe’ and ‘not sure,’ he said, “Don’t tell me your autonomous vehicle is safe, when you haven’t done the engineering legwork to determine whether it is safe, or if you aren’t sure.”
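
The three-way distinction can be made explicit in software by refusing to collapse "not sure" into either of the other two answers. The following is a minimal sketch under stated assumptions: the `Verdict`, `Conditions`, and `assess` names, the ODD contents, and the idea of passing in a validation result are all hypothetical and are not taken from the policy or from Koopman.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Verdict(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"
    NOT_SURE = "not sure"

@dataclass(frozen=True)
class Conditions:
    area: str
    weather: str

# Hypothetical ODD, invented for illustration.
ODD_AREAS = {"downtown Pittsburgh"}
ODD_WEATHER = {"clear", "rain"}

def inside_odd(conditions: Conditions) -> bool:
    """True only when every condition falls within the designed domain."""
    return conditions.area in ODD_AREAS and conditions.weather in ODD_WEATHER

def assess(conditions: Conditions, validated_safe: Optional[bool]) -> Verdict:
    """validated_safe is None when the engineering analysis has not been done."""
    # Evidence gathered for the ODD says nothing about conditions outside it.
    if not inside_odd(conditions) or validated_safe is None:
        return Verdict.NOT_SURE
    return Verdict.SAFE if validated_safe else Verdict.UNSAFE

print(assess(Conditions("downtown Pittsburgh", "snow"), True).value)
```

Here a vehicle validated for clear and rainy weather reports "not sure" in snow, rather than inheriting a "safe" label it has not earned.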