HPCWire – There are 35 TOP500 systems with NVIDIA GPUs (twice as many as in June). Three of the top five supercomputers are equipped with GPUs, and more are on the way in 2012, including the 20-petaflop Titan system at Oak Ridge National Lab and the 11.5-petaflop Blue Waters supercomputer at NCSA.
In his SC11 keynote, Huang pointed out that the rise of HPC-style GPU computing has come about because traditional CPUs, especially x86 ones, have become rather inefficient at compute- and data-intensive computation. For example, he said CPUs use 50 times as much energy to schedule the instructions and 20 times as much energy to move the data as they do to perform the actual calculation.
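To put those ratios in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes (this is an interpretation, not something Huang stated) that the scheduling and data-movement costs simply add to the cost of the calculation itself, with the calculation normalized to one unit of energy. The function and constant names are illustrative only.

```python
# Relative energy costs per operation, normalized so that the
# calculation itself costs 1 unit. The 50x and 20x figures are the
# ones quoted from the keynote; treating them as additive is an
# assumption made for this sketch.
COMPUTE = 1.0
SCHEDULING = 50.0
DATA_MOVEMENT = 20.0


def useful_fraction(compute, scheduling, data_movement):
    """Return the share of total energy spent on the calculation itself."""
    total = compute + scheduling + data_movement
    return compute / total


fraction = useful_fraction(COMPUTE, SCHEDULING, DATA_MOVEMENT)
print(f"{fraction:.1%} of the energy goes to the calculation itself")
```

Under that additive assumption, only about one part in 71 of the energy budget is spent on the arithmetic, which is the inefficiency Huang was pointing to.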
GPUs, by contrast, are designed to reduce data movement, and although their simple processing engines deliver poor single-threaded performance, there are many more of them to do the work in parallel. That makes for more efficient computation, assuming the application can be molded into the GPU computing model.
Huang believes the demand for energy-efficient HPC flops will work in NVIDIA’s favor, noting that “supercomputers have become power limited, just like cell phones, just like tablets.” From his perspective, future GPUs will be the platform of choice to power exaflop machines. And although Huang said those supercomputers will be able to perform at that level with just 20 MW, his crystal ball doesn’t have that happening until 2022.
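The efficiency implied by an exaflop in 20 MW can be worked out directly. The short Python sketch below does that arithmetic, taking an exaflop as 10^18 floating point operations per second and assuming the full 20 MW goes toward delivering those flops (a simplification; the variable and function names are illustrative).

```python
# Back-of-the-envelope check of the exaflop-in-20-MW target.
EXAFLOP = 1e18          # floating point operations per second
POWER_BUDGET_W = 20e6   # 20 MW expressed in watts


def flops_per_watt(flops, watts):
    """Sustained flops delivered per watt of power."""
    return flops / watts


def joules_per_flop(flops, watts):
    """Energy spent per floating point operation (watts are joules/second)."""
    return watts / flops


efficiency = flops_per_watt(EXAFLOP, POWER_BUDGET_W)
energy = joules_per_flop(EXAFLOP, POWER_BUDGET_W)

print(f"{efficiency / 1e9:.0f} gigaflops per watt")
print(f"{energy * 1e12:.0f} picojoules per flop")
```

That works out to 50 gigaflops per watt, or 20 picojoules for every floating point operation, including whatever overhead the system spends scheduling instructions and moving data.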
In that timeframe, a second- or third-generation integrated ARM-GPU processor will be the most likely design. NVIDIA’s “Maxwell” GPU generation, scheduled to make its appearance in the middle of the decade, is slated to be the first NVIDIA platform to integrate the company’s upcoming “Project Denver” ARM CPU, a homegrown design that will become the basis for all of its product lines. From then on, it’s safe to assume that integration will just get tighter. By 2022, it may not make much sense to even refer to these heterogeneous processors as GPUs anymore.
NVIDIA’s early lead in the HPC accelerator business is not insurmountable, though. Intel is also positioning itself to be the dominant chip maker of the exascale era, drawing its own line in the sand with a target of 2018 for an Intel-powered exaflop machine. The most likely processor design for such a system will involve Xeon cores integrated with MIC cores on the same chip, although no public plans to that effect have been aired.
AMD has been more equivocal with regard to its exascale aspirations, but the company has certainly been the early mover in heterogeneous CPU-GPU designs with its Fusion APU architecture. Its near-term plans involve putting high-end “Bulldozer” cores into an APU next year as well as adding ECC to its GPU computing line.
There could be other vendors to challenge NVIDIA and its competitors for the future of supercomputing. Texas Instruments, for example, has just officially launched a floating point DSP with rather impressive performance-per-watt numbers that is being cross-targeted to HPC. Other ARM vendors could get into the act too, especially if the ARM architecture is able to establish itself in the server space with the upcoming 64-bit designs.
The lesson of NVIDIA, pointed out by Huang in his keynote, is that disruptive technologies, like GPU computing, often emerge from new products, like cell phones and tablets, which quickly ramp into volume markets. And although NVIDIA has managed to exploit that phenomenon very effectively for HPC over the last five years, it is unlikely to be the last company to do so. The volume market for the processor of the exascale era may not even exist yet.