Optical communication between cores on a chip and an operating system that can scale to thousands of cores

At MIT, a host of researchers are exploring how to reinvent chip architecture from the ground up, to ensure that adding more cores makes chips perform better, not worse.

One way to improve communication between cores, which the Angstrom project is investigating, is optical communication — using light instead of electricity to move data. Though prototype chips with optical-communications systems have been built in the lab, they rely on exotic materials that are difficult to integrate into existing chip-manufacturing processes. Two of the Angstrom researchers are investigating optical-communications schemes that use more practical materials.

In August 2010, the U.S. Department of Defense’s Defense Advanced Research Projects Agency announced that it was dividing almost $80 million among four research teams as part of a “ubiquitous high-performance computing” initiative. Three of those teams are led by commercial chip manufacturers. The fourth, which includes researchers from Mercury Computer, Freescale, the University of Maryland and Lockheed Martin, is led by MIT’s Computer Science and Artificial Intelligence Lab and will concentrate on the development of multicore systems.

The MIT project, called Angstrom, involves 19 MIT researchers (so far) and is headed by Anant Agarwal, a professor in the Department of Electrical Engineering and Computer Science.

Angstrom member Vladimir Stojanović of the Microsystems Technology Laboratories is collaborating with several chip manufacturers to build prototype chips with polysilicon waveguides. Waveguides are ridges on the surface of a chip that can direct optical signals; polysilicon is a type of silicon that consists of tiny, distinct crystals of silicon clumped together. Typically used in the transistor element called the gate, polysilicon has been part of the standard chip-manufacturing process for decades.

Other Angstrom researchers, however, are working on improving electrical connections between cores. In today’s multicore chips, adjacent cores typically have two high-capacity connections between them, which carry data in opposite directions, like the lanes of a two-lane highway. But in future chips, cores’ bandwidth requirements could fluctuate wildly.
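
Purely as an illustration of why a fixed two-lane split can be a poor fit for fluctuating traffic, here is a toy sketch of a link whose lanes are handed to whichever direction is busier. The lane count, the traffic figures, and the allocation policy are assumptions, not the Angstrom design.

```c
/* Illustrative only -- not the Angstrom design.  A link between two adjacent
 * cores has TOTAL_LANES unidirectional lanes.  The traditional layout fixes
 * half of them in each direction; here lanes are reassigned to whichever
 * direction currently has more traffic queued. */
#include <stdio.h>

#define TOTAL_LANES 8

/* Split the lanes in proportion to demand, keeping at least one per direction. */
static void split_lanes(int demand_ab, int demand_ba, int *lanes_ab, int *lanes_ba)
{
    int total = demand_ab + demand_ba;
    if (total == 0) {                         /* idle link: fall back to an even split */
        *lanes_ab = *lanes_ba = TOTAL_LANES / 2;
        return;
    }
    *lanes_ab = TOTAL_LANES * demand_ab / total;
    if (*lanes_ab < 1)
        *lanes_ab = 1;
    if (*lanes_ab > TOTAL_LANES - 1)
        *lanes_ab = TOTAL_LANES - 1;
    *lanes_ba = TOTAL_LANES - *lanes_ab;
}

int main(void)
{
    /* Demand that fluctuates wildly from one interval to the next (units queued each way). */
    int demand[][2] = { {100, 100}, {700, 50}, {10, 900}, {400, 400} };

    for (int i = 0; i < 4; i++) {
        int ab, ba;
        split_lanes(demand[i][0], demand[i][1], &ab, &ba);
        printf("A->B %3d, B->A %3d : adaptive split %d/%d, fixed split %d/%d\n",
               demand[i][0], demand[i][1], ab, ba, TOTAL_LANES / 2, TOTAL_LANES / 2);
    }
    return 0;
}
```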

Future Dynamic Resource Allocation Operating System for chips with thousands of cores

A computer with hundreds or thousands of cores tackling different aspects of a problem and exchanging data offers much more opportunity than an ordinary computer does for something to go badly wrong.

At the same time, it has more resources to throw at any problems that do arise. So, says Anant Agarwal, who leads the Angstrom project, a multicore operating system needs both to be more self-aware — to have better information about the computer’s performance as a whole — and to have more control of the operations executed by the hardware.

To some extent, increasing self-awareness requires hardware: Each core in the Angstrom chip, for instance, will have its own thermometer, so that the operating system can tell if any part of the chip is overheating. But crucial to the Angstrom operating system — dubbed FOS, for factored operating system — is a software-based performance measure, which Agarwal calls “heartbeats.” Programmers writing applications to run on FOS will have the option of setting performance goals: A video player, for instance, may specify that the playback rate needs to be an industry-standard 30 frames per second. Software will automatically interpret that requirement and emit a simple signal — a heartbeat — each time a frame is displayed.
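
In rough outline, the mechanism might look like the sketch below, which assumes a hypothetical heartbeat() interface rather than the real FOS API: the application declares its 30-frames-per-second goal, beats once per displayed frame, and the runtime compares the observed rate with the goal.

```c
/* A minimal sketch of the heartbeat idea; the interface and smoothing policy
 * are assumptions for illustration, not the FOS implementation. */
#include <stdio.h>
#include <time.h>

#define TARGET_RATE_HZ 30.0          /* the industry-standard playback rate */

static double last_beat;             /* time of the previous heartbeat, in seconds */
static double observed_rate;         /* smoothed heartbeats per second */

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Called once each time a frame is displayed. */
static void heartbeat(void)
{
    double t = now_seconds();
    if (last_beat > 0.0) {
        double instant = 1.0 / (t - last_beat);
        /* exponential smoothing so a single slow frame does not dominate */
        observed_rate = 0.9 * observed_rate + 0.1 * instant;
    }
    last_beat = t;
}

int main(void)
{
    observed_rate = TARGET_RATE_HZ;  /* assume we start on target */

    for (int frame = 0; frame < 90; frame++) {
        /* Stand-in for decoding and displaying one frame: ~40 ms, i.e. only ~25 fps. */
        struct timespec delay = { .tv_sec = 0, .tv_nsec = 40 * 1000 * 1000 };
        nanosleep(&delay, NULL);
        heartbeat();
    }

    printf("target %.0f fps, observed %.1f fps -> %s\n",
           TARGET_RATE_HZ, observed_rate,
           observed_rate < TARGET_RATE_HZ ? "add cores or cut corners" : "on track");
    return 0;
}
```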

If the heartbeat rate falls below 30, FOS can allocate more cores to the video player. Alternatively, if system resources are in short supply, it can adopt computational shortcuts to get the heartbeat rate back up. Computer-science professor Martin Rinard’s group has been investigating cases where accuracy can be traded for speed and has developed a technique it calls “loop perforation.” A loop is an operation that’s repeated on successive pieces of data — like, say, pixels in a frame of video — and to perforate a loop is simply to skip some iterations of the operation. Graduate student Hank Hoffmann has been working with Agarwal to give FOS the ability to perforate loops on the fly.
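
The sketch below shows loop perforation on a loop whose result can tolerate approximation; the frame dimensions, the brightness computation, and the fixed perforation factor are illustrative choices, whereas FOS would adjust the factor on the fly as the heartbeat demands.

```c
/* Loop perforation: skip some iterations of a loop, trading accuracy for speed. */
#include <stdio.h>
#include <stdlib.h>

#define WIDTH  1920
#define HEIGHT 1080

/* stride == 1 runs the full loop; stride > 1 perforates it by skipping pixels. */
static double average_brightness(const unsigned char *pixels, int stride)
{
    long long sum = 0, count = 0;
    for (long long i = 0; i < (long long)WIDTH * HEIGHT; i += stride) {
        sum += pixels[i];
        count++;
    }
    return (double)sum / count;
}

int main(void)
{
    unsigned char *frame = malloc((size_t)WIDTH * HEIGHT);
    if (frame == NULL)
        return 1;
    for (long long i = 0; i < (long long)WIDTH * HEIGHT; i++)
        frame[i] = rand() % 256;     /* stand-in for real pixel data */

    printf("exact:      %.2f (every pixel)\n", average_brightness(frame, 1));
    printf("perforated: %.2f (every 4th pixel, roughly 4x less work)\n",
           average_brightness(frame, 4));
    free(frame);
    return 0;
}
```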

* self-aware hardware, such as the per-core thermometers described above

* allow programmers to specify several different algorithms for each task a program performs, so that the operating system can automatically select the one that works best under any given circumstances (see the first sketch after this list).

* Sending a core its assignment actually consumes four times as much bandwidth as swapping the contents of caches does, so it also consumes more energy. But in multicore chips, multiple cores will frequently have cached copies of the same data. If one core modifies its copy, all the other copies have to be updated, too, which eats up both energy and time. By reducing the need for cache updates, says MIT computer-science professor Srinivas Devadas, a multicore system that uses his approach could outperform one that uses the traditional approach (see the second sketch after this list).
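
First, a minimal sketch of letting the system choose among interchangeable algorithms for the same task. The routines, the time budget, and the keep-the-first-that-fits policy are all illustrative assumptions, not the FOS mechanism: full_sort is exact, while partial_sort trades accuracy for speed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 200000

typedef void (*sort_fn)(int *a, int n);

static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Exact algorithm: sort everything. */
static void full_sort(int *a, int n)    { qsort(a, n, sizeof(int), cmp_int); }

/* Cheaper alternative: only sort the first half -- acceptable when the result
 * is, say, an approximate ranking rather than an exact one. */
static void partial_sort(int *a, int n) { qsort(a, n / 2, sizeof(int), cmp_int); }

static double run_and_time(sort_fn f, int *a, int n)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    f(a, n);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    struct { const char *name; sort_fn fn; } algorithms[] = {
        { "full_sort",    full_sort    },
        { "partial_sort", partial_sort },
    };
    double budget = 0.010;               /* 10 ms per task, chosen arbitrarily */
    int *data = malloc(N * sizeof(int));
    if (data == NULL)
        return 1;

    /* Try the registered algorithms in order of preference and keep the first
     * one that fits the budget -- a crude stand-in for the OS's choice. */
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < N; j++)
            data[j] = rand();
        double secs = run_and_time(algorithms[i].fn, data, N);
        printf("%-12s took %.4f s\n", algorithms[i].name, secs);
        if (secs <= budget) {
            printf("selected %s for future calls\n", algorithms[i].name);
            break;
        }
    }
    free(data);
    return 0;
}
```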
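
Second, a back-of-the-envelope look at the trade-off in the last bullet. The four-times figure comes from the text above; the per-update cost and the number of cores sharing the data are assumptions chosen only to show where each approach wins.

```c
#include <stdio.h>

int main(void)
{
    double swap       = 1.0;          /* bandwidth of swapping cache contents (arbitrary unit) */
    double assignment = 4.0 * swap;   /* sending a core its assignment: 4x the bandwidth */
    double update     = 0.5 * swap;   /* assumed cost of updating one stale cached copy */

    for (int sharers = 0; sharers <= 12; sharers += 3) {
        double traditional = swap + sharers * update;  /* move the data, then update every sharer */
        double migrated    = assignment;               /* move the work; no copies go stale */
        printf("%2d sharing cores: traditional %4.1f vs assignment %4.1f -> %s wins\n",
               sharers, traditional, migrated,
               migrated < traditional ? "assignment" : "traditional");
    }
    return 0;
}
```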
