A new type of neural network made with memristors can dramatically improve the efficiency of teaching machines to think like humans. The network, called a reservoir computing system, could predict words before they are said during conversation, and help predict future outcomes based on the present.
Reservoir computing systems, which improve on a typical neural network’s capacity and reduce the required training time, have been created in the past with larger optical components. However, the U-M group created their system using memristors, which require less space and can be integrated more easily into existing silicon-based electronics.
Memristors are a special type of resistive device that can both perform logic and store data. This contrasts with typical computer systems, where processors perform logic separate from memory modules. In this study, Lu’s team used a special memristor that memorizes events only in the near history.
Inspired by brains, neural networks are composed of neurons, or nodes, and synapses, the connections between nodes.
To train a neural network for a task, the network is given a large set of questions along with the answers to those questions. In this process, called supervised learning, the connections between nodes are weighted more heavily or lightly to minimize the error in reaching the correct answer.
Once trained, a neural network can then be tested on inputs whose answers it has not been given. For example, a system can process a new photo and correctly identify a human face, because it has learned the features of human faces from other photos in its training set.
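As a rough illustration of how weights are nudged to reduce error, here is a minimal sketch of supervised learning on a single linear node. The data set, the hidden rule, the learning rate and the function name `train` are all assumptions made for illustration; none of them come from the study.

```python
# Toy supervised learning: one linear node learns from (question, answer) pairs.
# The data, rule, and learning rate are illustrative assumptions only.

def train(examples, learning_rate=0.1, epochs=200):
    """Adjust two connection weights to shrink the prediction error."""
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for inputs, answer in examples:
            prediction = sum(w * x for w, x in zip(weights, inputs))
            error = prediction - answer
            # Strengthen or weaken each connection in proportion to its
            # contribution to the error (one gradient-descent step).
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights

# Answers follow a hidden rule the node must discover: answer = 2*x1 - 1*x2.
examples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
weights = train(examples)
print([round(w, 3) for w in weights])
```

After training, the two weights settle close to 2 and -1, the values that reproduce every answer in the toy set. Real networks repeat the same idea across many more weights, which is where the training cost comes from.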
“A lot of times, it takes days or months to train a network,” says Lu. “It is very expensive.”
Image recognition is also a relatively simple problem, as it doesn’t require any information apart from a static image. More complex tasks, such as speech recognition, can depend highly on context and require neural networks to have knowledge of what has just occurred, or what has just been said.
“When transcribing speech to text or translating languages, a word’s meaning and even pronunciation will differ depending on the previous syllables,” says Lu.
This requires a recurrent neural network, which incorporates loops within the network that give the network a memory effect. However, training these recurrent neural networks is especially expensive, Lu says.
Reservoir computing systems built with memristors, however, can skip most of the expensive training process and still provide the network the capability to remember. This is because the most critical component of the system – the reservoir – does not require training.
When a set of data is fed into the reservoir, the reservoir identifies important time-related features of the data and hands them off in a simpler format to a second network. This second network then needs only the kind of training used for simpler neural networks: the weights on the features and outputs passed on by the first network are adjusted until the error reaches an acceptable level.
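The division of labor between an untrained reservoir and a trained second network can be sketched in a few lines. The node model below (a state that fades each step and rises with each pulse) is a hypothetical stand-in for a short-term-memory device, not the memristor model used in the study. The constants `DECAYS` and `GAIN`, the perceptron readout and the "early versus recent activity" task are likewise assumptions chosen to keep the example small.

```python
# Toy reservoir computing sketch. The node model is a hypothetical stand-in
# for a short-term-memory device, not the study's memristor model.

DECAYS = [0.3, 0.6, 0.9]   # fixed, untrained per-node decay rates (assumed)
GAIN = 0.5                 # state boost per input pulse (assumed)

def reservoir_states(pulses):
    """Feed a pulse train to every node and return the final node states."""
    states = []
    for decay in DECAYS:
        s = 0.0
        for pulse in pulses:
            s *= decay                 # memory of older pulses fades
            if pulse:
                s += GAIN * (1.0 - s)  # a new pulse raises the state
        states.append(s)
    return states

def train_readout(samples, max_epochs=5000):
    """Perceptron readout: the only part of the system that is trained."""
    weights, bias = [0.0] * len(DECAYS), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for pulses, label in samples:              # label is +1 or -1
            feats = reservoir_states(pulses)
            score = sum(w * f for w, f in zip(weights, feats)) + bias
            if label * score <= 0:                 # wrong side: adjust weights
                weights = [w + label * f for w, f in zip(weights, feats)]
                bias += label
                mistakes += 1
        if mistakes == 0:                          # acceptable error reached
            break
    return weights, bias

def classify(pulses, weights, bias):
    """Apply the trained readout to the reservoir's final states."""
    feats = reservoir_states(pulses)
    score = sum(w * f for w, f in zip(weights, feats)) + bias
    return 1 if score > 0 else -1

# Task: did the pulses arrive early (-1) or recently (+1) in a 4-step window?
samples = [([1, 1, 0, 0], -1), ([1, 0, 0, 0], -1), ([0, 1, 0, 0], -1),
           ([0, 0, 1, 1], +1), ([0, 0, 0, 1], +1), ([0, 0, 1, 0], +1)]
weights, bias = train_readout(samples)
print([classify(pulses, weights, bias) for pulses, _ in samples])
```

The reservoir constants never change during training. Only the readout weights are adjusted, which is the source of the savings described above.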
The team proved the reservoir computing concept using a test of handwriting recognition, a common benchmark among neural networks. Numerals were broken up into rows of pixels, and fed into the computer with voltages like Morse code, with zero volts for a dark pixel and a little over one volt for a white pixel.
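The encoding step can be pictured with a short sketch. The article gives zero volts for a dark pixel and "a little over one volt" for a white one, so the 1.1 V figure below is an assumption, as are the tiny 5-by-4 example image and the function name `rows_to_pulse_trains`.

```python
# Sketch of the row-by-row voltage encoding described in the article.

V_DARK = 0.0   # volts for a dark pixel (from the article)
V_WHITE = 1.1  # volts for a white pixel; the exact value is an assumption

def rows_to_pulse_trains(image):
    """Turn each row of a binary image into a sequence of voltage pulses."""
    return [[V_WHITE if pixel else V_DARK for pixel in row] for row in image]

# A hypothetical 5x4 binary sketch of the numeral "1" (1 = white, 0 = dark).
digit = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]

pulse_trains = rows_to_pulse_trains(digit)
for train in pulse_trains:
    print(train)
```

Each inner list is one pulse train, applied in time order to a device, so the order of the pixels in a row becomes the order of events the device remembers.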
The reservoir achieved 91% accuracy in identifying handwritten numerals using only 88 memristors as nodes, whereas a conventional network would require thousands of nodes for the task.
Reservoir computing systems are especially adept at handling data that varies with time, like a stream of data or words, or a function depending on past results.
To demonstrate this, the team tested a complex function that depended on multiple past results, which is common in engineering fields. The reservoir computing system was able to model the complex function with minimal error.
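To make "a function that depends on multiple past results" concrete, the sketch below generates such a series. The article does not state the function the team used; the recurrence here is a second-order nonlinear form often used as a benchmark in the reservoir computing literature and is assumed purely for illustration, along with the input range and the random seed.

```python
import random

def second_order_series(inputs):
    """Generate outputs where each value depends on the two previous outputs.

    Assumed benchmark form (not taken from the article):
        y(k) = 0.4*y(k-1) + 0.4*y(k-1)*y(k-2) + 0.6*u(k)**3 + 0.1
    """
    outputs = []
    y_prev1 = y_prev2 = 0.0            # y(k-1) and y(k-2), starting at rest
    for u in inputs:
        y = 0.4 * y_prev1 + 0.4 * y_prev1 * y_prev2 + 0.6 * u ** 3 + 0.1
        outputs.append(y)
        y_prev2, y_prev1 = y_prev1, y  # shift the two-step history
    return outputs

rng = random.Random(0)                                   # fixed seed (assumed)
inputs = [rng.uniform(0.0, 0.5) for _ in range(200)]     # assumed input range
targets = second_order_series(inputs)
print(targets[:5])
```

A system modeling this series sees only the inputs and the target outputs, never the formula itself, so it has to rely on its own short-term memory to capture the dependence on earlier results.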
Lu plans on exploring two future paths with this research: speech recognition and predictive analysis.
“We can make predictions on natural spoken language, so you don’t even have to say the full word,” explains Lu.
“We could actually predict what you plan to say next.”
In predictive analysis, Lu hopes to use the system to take in signals with noise, like static from far-off radio stations, and produce a cleaner stream of data. “It could also predict and generate an output signal even if the input stopped,” he says.
Reservoir computing systems utilize dynamic reservoirs having short-term memory to project features from the temporal inputs into a high-dimensional feature space. A readout function layer can then effectively analyze the projected features for tasks such as classification and time-series analysis. The system can efficiently compute complex and temporal data with low training cost, since only the readout function needs to be trained. Here we experimentally implement a reservoir computing system using a dynamic memristor array. We show that the internal ionic dynamic processes of memristors allow the memristor-based reservoir to directly process information in the temporal domain, and demonstrate that even a small hardware system with only 88 memristors can already be used for tasks such as handwritten digit recognition. The system is also used to experimentally solve a second-order nonlinear task, and can successfully predict the expected output without knowing the form of the original dynamic transfer function.
In this study, we demonstrate a memristor-based reservoir computing (RC) system by utilizing the internal, short-term ionic dynamics of memristor devices. We show experimentally that even a small reservoir consisting of 88 memristor devices can be used to process real-world problems such as handwritten digit recognition with performance comparable to that achieved in much larger networks. A similar-sized network is also used to solve a second-order nonlinear dynamic problem and is able to successfully predict the expected dynamic output without knowing the form of the transfer function.
It should be noted that the system is not yet fully optimized for the handwritten digit recognition task, so the performance could still be improved further. First, information from the original data has already been partially lost during the preprocessing, such as transforming the grayscale image to binary data. Second, the pulse amplitude, width and rates could still be fine-tuned to maximize classification accuracy. Additionally, while normal neural networks aim to extract features across the image from several rows simultaneously, the reservoir presented here only processes each row separately and independently. A quick solution would be to also scan the digit in the vertical direction and input each column to the reservoir, allowing relations between the rows to be processed by the reservoir as well. Indeed, adding a vertical scan can improve the classification accuracy to 92.1%, as verified through simulation using the device model, although the system also becomes larger and requires 672 inputs.
The computing capacity added by the memristor-based reservoir layer was analyzed by comparing the RC system performance with networks having the same connectivity patterns, in which the reservoir layer was replaced with a conventional nonlinear downsampling function. The RC system outperforms the conventional approach, and the advantage is significant at small readout network sizes, even for the image analysis task that is not naturally fitted for RC. For the second-order dynamic problem that is more naturally suited for the RC system, our analysis shows that the small RC system significantly outperforms a conventional linear network, with orders-of-magnitude improvements in the normalized mean squared error (NMSE) of the prediction. We also show that the inherent device variations, which can pose significant challenges for some applications, become a benefit for RC systems, as they help make the reservoir states more separable.
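For readers unfamiliar with the NMSE figure of merit, the sketch below computes one common form of it: the summed squared prediction error divided by the summed squared target values. The text here does not spell out which normalization was used, so this particular definition, the function name `nmse` and the sample numbers are assumptions.

```python
def nmse(predicted, target):
    """Normalized mean squared error under one common (assumed) definition:
    sum of squared errors divided by sum of squared target values."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    squared_error = sum((p - t) ** 2 for p, t in zip(predicted, target))
    target_energy = sum(t ** 2 for t in target)
    if target_energy == 0:
        raise ValueError("target must contain at least one nonzero value")
    return squared_error / target_energy

# Hypothetical numbers: a close prediction gives a small NMSE.
print(nmse([0.11, 0.19, 0.31], [0.10, 0.20, 0.30]))
```

Because the error is scaled by the size of the target signal, a value near zero means a close match regardless of the signal's amplitude, which makes it possible to compare systems in orders of magnitude.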
The demonstration of memristor-based RC systems will stimulate continued developments to further optimize the network performance toward broad applications in areas such as speech analysis, action recognition and prediction. This approach will also be attractive for applications that do not require fast processing speed but have strong constraints on memory size and computation power. Finally, we want to note that the crossbar used in this work mainly provides the high-density devices, and the devices function independently in the reservoir since the short-term memory property is a native property of the device itself. Future algorithm and experimental advances that can take full advantage of the interconnected nature of the crossbar structures, by utilizing the intrinsic sneak paths and possible loops in the system, may further enhance the computing capacity of the system.