These modified pictures are called adversarial images, and they are a significant threat. “An adversarial example for the face recognition domain might consist of very subtle markings applied to a person’s face, so that a human observer would recognize their identity correctly, but a machine learning system would recognize them as being a different person,” say Alexey Kurakin and Samy Bengio at Google Brain and Ian Goodfellow from OpenAI, a nonprofit AI research company.
Because machine vision systems are so new, little is known about adversarial images. Nobody understands how best to create them, how they fool machine vision systems, or how to protect against this kind of attack.
Today, that starts to change thanks to the work of Kurakin and co, who have begun to study adversarial images systematically for the first time. Their work shows just how vulnerable machine vision systems are to this kind of attack.
The performance in these tests is measured by counting how often the algorithm has the correct classification in its top 5 answers or even its top 1 answer (its so-called top 5 accuracy or top 1 accuracy) or how often it does not have the correct answer in its top 5 or top 1 (its top 5 error rate or top 1 error rate).
One of the best machine vision systems is Google’s Inception v3 algorithm, which has a top 5 error rate of 3.46 percent. Humans doing the same test have a top 5 error rate of about 5 percent, so Inception v3 really does have superhuman abilities.
Kurakin and co created a database of adversarial images by modifying 50,000 pictures from ImageNet in three different ways. Their methods exploit the idea that neural networks process information to match an image with a particular classification. The amount of information this requires, called the cross entropy, is a measure of how hard the matching task is.
Their first algorithm makes a small change to an image in a way that attempts to maximize this cross entropy. Their second algorithm simply iterates this process to further alter the image.
These algorithms both change the image in a way that makes it harder to classify correctly. “These methods can result in uninteresting misclassifications, such as mistaking one breed of sled dog for another breed of sled dog,” they say.
Their final algorithm has much cleverer approach. This modifies an image in way that directs the machine vision system into misclassifying it in a specific way, preferably one that is least like the true class. “The least-likely class is usually highly dissimilar from the true class, so this attack method results in more interesting mistakes, such as mistaking a dog for an airplane,” say Kurakin and co.
SOURCES- Technology Review