Using functional Magnetic Resonance Imaging (fMRI) and computational models, UC Berkeley researchers have succeeded in decoding and reconstructing people’s dynamic visual experiences – in this case, watching Hollywood movie trailers.
As yet, the technology can only reconstruct movie clips people have already viewed. However, the breakthrough paves the way for reproducing the movies inside our heads that no one else sees, such as dreams and memories, according to researchers.
“This is a major leap toward reconstructing internal imagery,” said Professor Jack Gallant, a UC Berkeley neuroscientist and coauthor of the study published online today (Sept. 22) in the journal Current Biology. “We are opening a window into the movies in our minds.”
Eventually, practical applications of the technology could include a better understanding of what goes on in the minds of people who cannot communicate verbally, such as stroke victims, coma patients and people with neurodegenerative diseases.
It may also lay the groundwork for brain-machine interfaces so that people with cerebral palsy or paralysis, for example, can guide computers with their minds.
However, researchers point out that the technology is decades from allowing users to read others’ thoughts and intentions, as portrayed in such sci-fi classics as “Brainstorm,” in which scientists recorded a person’s sensations so that others could experience them.
Previously, Gallant and fellow researchers recorded brain activity in the visual cortex while a subject viewed black-and-white photographs. They then built a computational model that enabled them to predict with overwhelming accuracy which picture the subject was looking at.
In their latest experiment, researchers say they have solved a much more difficult problem by actually decoding brain signals generated by moving pictures.
“Our natural visual experience is like watching a movie,” said Shinji Nishimoto, lead author of the study and a post-doctoral researcher in Gallant’s lab. “In order for this technology to have wide applicability, we must understand how the brain processes these dynamic visual experiences.”
Nishimoto and two other research team members served as subjects for the experiment, because the procedure requires volunteers to remain still inside the MRI scanner for hours at a time.
They watched two separate sets of Hollywood movie trailers, while fMRI was used to measure blood flow through the visual cortex, the part of the brain that processes visual information. On the computer, the brain was divided into small, three-dimensional cubes known as volumetric pixels, or “voxels.”
“We built a model for each voxel that describes how shape and motion information in the movie is mapped into brain activity,” Nishimoto said.
The brain activity recorded while subjects viewed the first set of clips was fed into a computer program that learned, second by second, to associate visual patterns in the movie with the corresponding brain activity.
Brain activity evoked by the second set of clips was used to test the movie reconstruction algorithm. This was done by feeding 18 million seconds of random YouTube videos into the computer program so that it could predict the brain activity that each film clip would most likely evoke in each subject.
Finally, the 100 clips that the computer program decided were most similar to the clip that the subject had probably seen were merged to produce a blurry yet continuous reconstruction of the original movie.
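The ranking-and-merging step described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the names `correlation` and `reconstruct` are hypothetical, Pearson correlation stands in for whatever similarity score the researchers actually used, and each clip is reduced to a single flat list of pixel values.

```python
import math

def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat signal carries no ranking information
    return cov / math.sqrt(var_a * var_b)

def reconstruct(observed, library, k):
    """Average the frames of the k best-matching library clips.

    observed: measured brain activity, one value per voxel.
    library:  list of (predicted_activity, frame) pairs, where frame is a
              flat list of pixel values (an assumption made for brevity).
    """
    # Rank library clips by how well their predicted activity matches
    # the activity that was actually measured.
    ranked = sorted(library,
                    key=lambda item: correlation(observed, item[0]),
                    reverse=True)
    top = ranked[:k]
    n_pixels = len(top[0][1])
    # Pixel-wise mean of the top clips; averaging is what makes the
    # result blurry.
    return [sum(frame[i] for _, frame in top) / len(top)
            for i in range(n_pixels)]
```

With k set to 100 and a library drawn from the YouTube footage, the same averaging would correspond to the merged reconstruction the article describes.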
Reconstructing movies using brain scans has been challenging because the blood flow signals measured using fMRI change much more slowly than the neural signals that encode dynamic information in movies, researchers said. For this reason, most previous attempts to decode brain activity have focused on static images.
“We addressed this problem by developing a two-stage model that separately describes the underlying neural population and blood flow signals,” Nishimoto said.
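One way to picture a two-stage model of this kind is a fast neural response followed by a slow blood-flow filter. The Python sketch below assumes a weighted sum of stimulus features for the first stage and a short smoothing kernel for the second; the function names, the weights and the kernel values are all invented for illustration and are not taken from the study.

```python
def convolve_causal(signal, kernel):
    """Filter `signal` so each output depends only on past and present samples."""
    out = []
    for t in range(len(signal)):
        total = 0.0
        for lag, weight in enumerate(kernel):
            if t - lag >= 0:
                total += weight * signal[t - lag]
        out.append(total)
    return out

def predict_bold(features, weights, hrf):
    """Predict one voxel's blood-flow signal from stimulus features.

    features: one list of feature values per time step.
    weights:  hypothetical per-feature weights for this voxel.
    hrf:      hypothetical hemodynamic response kernel (slow, delayed).
    """
    # Stage 1: fast neural response, a weighted sum of stimulus features.
    neural = [sum(w * f for w, f in zip(weights, frame)) for frame in features]
    # Stage 2: slow hemodynamics, which smear and delay the neural response.
    return convolve_causal(neural, hrf)
```

In this toy version, a brief burst of neural activity at one time step shows up in the predicted blood-flow signal only at later time steps, which is the mismatch in speed the researchers describe.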
Reconstructing visual experiences from brain activity evoked by natural movies.
Quantitative modeling of human brain activity can provide crucial insights about cortical representations and can form the basis for brain decoding devices. Recent functional magnetic resonance imaging (fMRI) studies have modeled brain activity elicited by static visual patterns and have reconstructed these patterns from brain activity. However, blood oxygen level-dependent (BOLD) signals measured via fMRI are very slow, so it has been difficult to model brain activity elicited by dynamic stimuli such as natural movies. Here we present a new motion-energy encoding model that largely overcomes this limitation. The model describes fast visual information and slow hemodynamics by separate components. We recorded BOLD signals in occipitotemporal visual cortex of human subjects who watched natural movies and fit the model separately to individual voxels. Visualization of the fit models reveals how early visual areas represent the information in movies. To demonstrate the power of our approach, we also constructed a Bayesian decoder by combining estimated encoding models with a sampled natural movie prior. The decoder provides remarkable reconstructions of the viewed movies. These results demonstrate that dynamic brain activity measured under naturalistic conditions can be decoded using current fMRI technology.
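As a rough illustration of how a Bayesian decoder can combine an encoding model with a sampled prior, the Python sketch below scores each clip in a sample by how well its predicted activity explains the measured activity. It assumes independent Gaussian noise with a single standard deviation and a uniform prior over the sampled clips; both are simplifying assumptions, and the function names are hypothetical.

```python
import math

def log_likelihood(observed, predicted, noise_sd):
    """Log of an independent Gaussian likelihood p(observed | clip)."""
    return sum(
        -0.5 * ((o - p) / noise_sd) ** 2
        - math.log(noise_sd * math.sqrt(2 * math.pi))
        for o, p in zip(observed, predicted)
    )

def posterior(observed, predictions, noise_sd=1.0):
    """Posterior probability of each sampled clip under a uniform prior.

    predictions: the encoding model's predicted activity for each clip
                 in the sample (one list of voxel values per clip).
    """
    logs = [log_likelihood(observed, p, noise_sd) for p in predictions]
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(logs)
    unnormalized = [math.exp(v - peak) for v in logs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]
```

Because the prior is uniform over the sample, the clips with the highest posterior here are simply those whose predicted activity lies closest to the measurement.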
While our computational models of some cortical visual areas perform well, they do not perform well when used to decode activity in other parts of the brain. A better understanding of the processing that occurs in parts of the brain beyond visual cortex (e.g., parietal cortex, frontal cortex) will be required before it will be possible to decode other aspects of human experience.
What are the future applications of this technology?
This study was not motivated by a specific application, but was aimed at developing a computational model of brain activity evoked by dynamic natural movies. That said, there are many potential applications of devices that can decode brain activity. In addition to their value as a basic research tool, brain-reading devices could be used to aid in diagnosis of diseases (e.g., stroke, dementia); to assess the effects of therapeutic interventions (drug therapy, stem cell therapy); or as the computational heart of a neural prosthesis. They could also be used to build a brain-machine interface.
Could this be used to build a brain-machine interface (BMI)?
Decoding visual content is conceptually related to the work on neural-motor prostheses being undertaken in many laboratories. The main goal in the prosthetics work is to build a decoder that can be used to drive a prosthetic arm or other device from brain activity. Of course there are some significant differences between sensory and motor systems that impact the way that a BMI system would be implemented in the two systems. But ultimately, the statistical frameworks used for decoding in the sensory and motor domains are very similar. This suggests that a visual BMI might be feasible.