Using a little-known Google Labs widget, computer scientists from UC San Diego and UCLA have brought common sense to an automated image labeling system. This common sense is the ability to use context to help identify objects in photographs.
For example, if a conventional automated object identifier has labeled a person, a tennis racket, a tennis court and a lemon in a photo, the new post-processing context check will re-label the lemon as a tennis ball.
Google Sets generates lists of related items or objects from just a few examples. If you type in John, Paul and George, it will return the words Ringo, Beatles and John Lennon. If you type “neon” and “argon” it will give you the rest of the noble gasses.
“In some ways, Google Sets is a proxy for common sense. In our paper, we showed that you can use this common sense to provide contextual information that improves the accuracy of automated image labeling systems,” said Belongie.
The computer scientists also highlight other advances they bring to automated object identification. First, instead of doing just one image segmentation, the researchers generated a collection of image segmentations and put together a shortlist of stable image segmentations. This increases the accuracy of the segmentation process and provides an implicit shape description for each of the image regions.
Second, the researchers ran their object categorization model on each of the segmentations, rather than on individual pixels. This dramatically reduced the computational demands on the object categorization model.
Right now, the researchers are exploring ways to extend context beyond the presence of objects in the same image. For example, they want to make explicit use of absolute and relative geometric relationships between objects in an image – such as “above” or “inside” relationships. This would mean that if a person were sitting on top of an animal, the system would consider the animal to be more likely a horse than a dog.
Google sets widget