- AMD Opteron: commodity approach. Lower efficiency for scientific applications is offset by the cost efficiencies of the mass market.
• Popular building block for HPC, from commodity systems to the tightly coupled XT3.
• Our AMD pricing is based on servers only, without interconnect.
- BlueGene/L: uses a generic embedded processor core and customizes the System on Chip (SoC) services around it to improve power efficiency for scientific applications.
• Power-efficient approach, with a high-concurrency implementation.
• The BG/L SoC includes logic for the interconnect network.
- Tensilica: in addition to customizing the SoC, also customizes the CPU core for further power-efficiency benefits while maintaining programmability.
• Design includes custom chip, fabrication, raw hardware, and interconnect.
Today, 10 petaflops of sustained performance would cost 10-20 times more. With Moore's Law, that level of performance should be available for the same price in about 5 years.
So by 2012-2013, a supercomputer based on configurable processors with 100-200 petaflops of peak performance would cost about $75 million, and an exaflop supercomputer would be in the $375-750 million range.
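The arithmetic behind those figures can be checked with a short sketch. It assumes that cost scales linearly with peak petaflops and that price-performance doubles every 18 months; both are simplifying assumptions of mine, and the function names are made up for illustration.

```python
def moore_factor(years, doubling_months=18):
    """Price-performance improvement after `years`, assuming a fixed doubling period."""
    return 2 ** (years * 12 / doubling_months)


def scaled_cost(base_cost_musd, base_petaflops, target_petaflops):
    """Cost in millions of dollars, assuming cost grows linearly with peak petaflops."""
    return base_cost_musd * target_petaflops / base_petaflops


# Five years of 18-month doublings gives roughly a 10x improvement,
# the low end of the "10-20 times" range quoted above.
five_year_gain = moore_factor(5)

# An exaflop is 1000 petaflops. Scale the $75 million figure from a
# 200 petaflop machine (cheap case) and a 100 petaflop machine (expensive case).
exaflop_low = scaled_cost(75, 200, 1000)
exaflop_high = scaled_cost(75, 100, 1000)

print(f"5-year gain: {five_year_gain:.1f}x")
print(f"Exaflop cost range: ${exaflop_low:.0f}-{exaflop_high:.0f} million")
```

Real systems rarely scale this cleanly, since interconnect, power, and cooling costs grow with machine size, so treat the output as a back-of-the-envelope check only.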
The arrival of affordable multi-petaflop supercomputers would help fulfill a couple of my computing predictions from 2006:
- 10 petaflop supercomputer by 2012-2013
- Petaflop personal computers and wearable computing by 2016-2018
Personal petaflop machines seem likely to come about from better GPGPUs, FPGAs, and the mainstreaming of several configurable components.
Another breakthrough allows four times as much memory in cheaper servers. High performance applications need more memory.
New memory controller allows four times as much memory to be placed into existing servers
MetaSDRAM is a drop-in solution that closes the gap between processor computing power, which doubles every 18 months, and DRAM capacity, which doubles only every 36 months. Until now, the industry addressed this gap by adding higher capacity, but not readily available, and exponentially more expensive DRAM to each dual in-line memory module (DIMM) on the motherboard.
The MetaSDRAM chipset, which sits between the memory controller and the DRAM, solves the memory capacity problem cost effectively by enabling up to four times more mainstream DRAMs to be integrated into existing DIMMs without the need for any hardware or software changes. The chipset makes multiple DRAMs look like a larger capacity DRAM to the memory controller. The result is "stealth" high-capacity memory that circumvents the normal limitations set by the memory controller. This new technology has accelerated memory technology development by 2-4 years.
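To get a feel for the gap described in the quote, here is a minimal sketch using only the two doubling periods it cites (18 months for processor power, 36 months for DRAM capacity). The function names are hypothetical, and the smooth exponential growth is an idealization.

```python
def growth(months, doubling_months):
    """Growth factor after `months`, assuming a fixed doubling period."""
    return 2 ** (months / doubling_months)


def capacity_gap(months, cpu_doubling=18, dram_doubling=36):
    """How far processor power pulls ahead of DRAM capacity over `months`."""
    return growth(months, cpu_doubling) / growth(months, dram_doubling)


for years in (3, 6):
    months = years * 12
    print(
        f"{years} years: CPU x{growth(months, 18):.0f}, "
        f"DRAM x{growth(months, 36):.0f}, gap x{capacity_gap(months):.0f}"
    )
```

Under these rates the gap itself doubles every 36 months, so a one-time fourfold increase in usable DRAM per DIMM is comparable to the gap that opens up over about six years.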
PowerPoint presentation describing the Berkeley National Lab plan for customized chips for more efficient and powerful supercomputers
Research paper on the IBM Kittyhawk project to build a global-scale computer. IBM wants to use supercomputers to handle many kinds of large-scale applications more efficiently than clusters of boxes can.
A glimpse of how this might take shape was revealed in a recent IBM Research paper that described using the Blue Gene/P supercomputer as a hardware platform for the Internet. The authors of the paper point to Blue Gene's exceptional compute density, highly efficient use of power, and superior performance per dollar. Regarding the drawbacks of the current infrastructure of the Internet, the authors write:
At present, almost all of the companies operating at web-scale are using clusters of commodity computers, an approach that we postulate is akin to building a power plant from a collection of portable generators. That is, commodity computers were never designed to be efficient at scale, so while each server seems like a low-price part in isolation, the cluster in aggregate is expensive to purchase, power and cool in addition to being failure-prone.
The IBM'ers are certainly talking about a more general-purpose petascale application than the Berkeley researchers, but one aspect is the same: ditch the loosely coupled, commodity-based systems in favor of a tightly coupled, customized architecture that focuses on low power and high throughput. If this is truly the model that emerges for ultra-scale computing, then the whole industry is in for a wild ride.