Paul Sajda, a professor of biomedical engineering, thinks he’s found a solution to such “information overload” that could revolutionize how vast amounts of visual information are processed—allowing users to riffle through potentially millions of images and home in on what they are looking for in record time. He’s used it successfully for both of these purposes.
It’s called a cortically coupled computer vision (C3Vision) system, and it uses a computer to amplify the power of the quickest and most accurate tool for object recognition ever created: the human brain.
The human brain has the capacity to process very complicated scenes and pick out relevant material before we’re even consciously aware we’re doing so. These “aha” moments of recognition generate an electrical signal that can be picked up using electroencephalography (EEG), the recording of electrical activity along the scalp caused by the firing of neurons in the brain.
Sajda tapped Shih-Fu Chang, a professor of electrical engineering and director of the department’s Digital Video and Multimedia Lab, to help him with the project. They designed a device that monitors brain activity as a subject rapidly views a small sample of photographs culled from a much larger database, at a rate of as many as 10 pictures a second. The device transmits the data to a computer that ranks the photographs by the strength of the cortical recognition response each one elicited. The computer then looks for similarities in the visual characteristics of the high-ranking photographs, such as color, texture and the shapes of edges and lines.
Then it scans the much larger database, which could contain upward of 50 million images, and pulls out those whose visual characteristics correlate most strongly with the “aha” moments detected by the EEG.
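For readers who want a concrete picture of that two-stage process, here is a minimal Python sketch. It is an illustration only, not the researchers’ implementation: the article does not say how the system compares images, so averaging the feature vectors of the top-ranked sample images and comparing by cosine similarity are stand-in assumptions, and every name and number below (`eeg_scores`, `sample_features`, `database_features`, the three-value color/texture/edge vectors) is hypothetical.

```python
from math import sqrt


def rank_by_eeg(eeg_scores, top_k):
    """Return the ids of the top_k sample images with the strongest EEG response."""
    return sorted(eeg_scores, key=eeg_scores.get, reverse=True)[:top_k]


def mean_vector(vectors):
    """Average a list of equal-length feature vectors, component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_database(eeg_scores, sample_features, database_features,
                    top_k=2, n_results=2):
    """Stage 1: pick the sample images the brain responded to most strongly.
    Stage 2: return the database images most similar to their average features."""
    top_ids = rank_by_eeg(eeg_scores, top_k)
    prototype = mean_vector([sample_features[i] for i in top_ids])
    ranked = sorted(
        database_features,
        key=lambda image_id: cosine(database_features[image_id], prototype),
        reverse=True,
    )
    return ranked[:n_results]


# Hypothetical data: one EEG response score per sample image, plus made-up
# [color, texture, edge] feature values for sample and database images.
eeg_scores = {"s1": 0.9, "s2": 0.8, "s3": 0.1}
sample_features = {
    "s1": [1.0, 0.0, 0.2],
    "s2": [0.9, 0.1, 0.1],
    "s3": [0.0, 1.0, 0.9],
}
database_features = {
    "d1": [0.95, 0.05, 0.15],
    "d2": [0.1, 0.9, 0.8],
    "d3": [0.8, 0.2, 0.2],
}

results = search_database(eeg_scores, sample_features, database_features)
print(results)  # ['d1', 'd3']
```

In this toy run the viewer’s brain responded most strongly to sample images s1 and s2, so the search favors database images d1 and d3, whose features resemble them, over d2. A real system would have to extract features from actual images and cope with noisy EEG signals, neither of which this sketch attempts.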
It’s an idea that has already drawn significant interest from the U.S. government. The Defense Advanced Research Projects Agency (DARPA), which pioneered such breakthrough technologies as computer networking, provided $2.4 million to test the device over the next 18 months. Analysts at the National Geospatial-Intelligence Agency will attempt to use the device to look for objects of interest within vast satellite images.
“Their big problem is they have tons of images and not enough eyes to look at them,” Sajda says. “Our device will do a quick triage and allow them to jump from region to region in ways that save time.”
Analyzing satellite imagery is just one application for the device. It could have applications for video games, and could even be used in the burgeoning field of neuro-marketing—since it helps provide key information about what visual characteristics literally light up a person’s brain with interest.
Sajda’s work is “very exciting,” says Michelle Zhou, a senior research manager at IBM’s Thomas J. Watson Research Center and the manager of the department of intelligent multimedia interaction. It demonstrates a step forward in the way machines and human abilities can be combined to enhance human power, she says.
“If anyone is wondering how the scenes in the popular James Cameron movie Avatar would ever become reality, [Paul Sajda’s] work definitely shows the first step toward it,” Zhou says.