Nvidia’s 2009 GPU compute power predictions and recent chips

Back in 2009, Nvidia’s CEO Jen-Hsun Huang predicted that GPUs (graphics processing units) would increase in power by 570 times over six years (up to 2016) from then-current levels. This would require nearly tripling the speed of the GPU every year.

William J. Dally, chief scientist at Nvidia, predicted that Nvidia GPUs in 2015 would be implemented on an 11 nm process, with roughly 5,000 cores and 20 teraflops of performance. The 2009 Nvidia GPUs delivered 500 gigaflops in single precision, so 20 teraflops would be 40 times faster, while 570 times faster in 2016 would be 285 teraflops. However, if Huang was referring to double precision, the increase would be from the 2009 level of 100 gigaflops to 57 teraflops of double precision performance.
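
The arithmetic behind these projections can be checked with a short Python sketch. The baseline figures (500 gigaflops single precision, 100 gigaflops double precision) and the 570-times multiplier come from the text above; the function and variable names are illustrative only.

```python
def projected_teraflops(baseline_gigaflops: float, multiplier: float) -> float:
    """Scale a baseline given in gigaflops and return the result in teraflops."""
    return baseline_gigaflops * multiplier / 1000.0


# Annual growth factor implied by a 570x gain spread evenly over six years.
annual_factor = 570 ** (1 / 6)

sp_target = projected_teraflops(500, 570)  # single precision baseline
dp_target = projected_teraflops(100, 570)  # double precision baseline
dally_ratio = 20_000 / 500                 # Dally's 20 teraflops vs. 500 gigaflops

print(f"implied annual growth: {annual_factor:.2f}x")
print(f"570x single precision: {sp_target:.0f} teraflops")
print(f"570x double precision: {dp_target:.0f} teraflops")
print(f"Dally's 20 teraflops:  {dally_ratio:.0f}x the 2009 chip")
```

The implied annual factor comes out a little under 2.9, slightly short of a true tripling, since tripling for six straight years would give 729 times rather than 570 times.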

As of September 2012, the GK110 GPU chip, sometimes called the Kepler2, has over 7.1 billion transistors etched on a die by foundry Taiwan Semiconductor Manufacturing Corp using its much-sought 28 nanometer process. It sports 15 SMX (streaming multiprocessor extreme) processing units, each with 192 single-precision CUDA cores and 64 double-precision floating point units (one for every triplet of CUDA cores). That gives you 960 DP floating point units alongside a maximum of 2,880 CUDA cores on the GK110 chip.

Nvidia has been vague about absolute performance, but the GK110 is expected to deliver just under 2 teraflops of raw DP floating point performance at 1GHz clock speeds on the cores and maybe 3.5 teraflops at single precision. That’s around three times the oomph (and three times the performance per watt, if the thermals are about the same) of the existing Fermi GF110 GPUs used in the Tesla M20 series of GPU coprocessors.
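
The "just under 2 teraflops" double precision figure is consistent with the unit counts quoted above. The sketch below assumes each DP unit retires one fused multiply-add per clock, counted as two floating point operations; that convention is an assumption here, not something stated in the source, and the function name is illustrative.

```python
def peak_teraflops(units: int, clock_ghz: float, flops_per_cycle: int = 2) -> float:
    """Peak throughput in teraflops.

    flops_per_cycle=2 assumes one fused multiply-add per unit per clock
    (an assumption, not a figure from the article).
    """
    return units * clock_ghz * flops_per_cycle / 1000.0


smx_count = 15
cuda_cores_per_smx = 192
dp_units_per_smx = 64

cuda_cores = smx_count * cuda_cores_per_smx  # 2,880 CUDA cores in total
dp_units = smx_count * dp_units_per_smx      # 960 DP units in total

dp_peak = peak_teraflops(dp_units, clock_ghz=1.0)
print(f"{cuda_cores} CUDA cores, {dp_units} DP units")
print(f"peak double precision at 1 GHz: {dp_peak:.2f} teraflops")
```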

The Tesla K10 GPU coprocessor puts two GK104 chips on a PCI-Express card and delivers 4.58 teraflops of SP number-crunching in a 225 watt thermal envelope – a staggering 3.5X the performance of the Fermi M2090 coprocessor.

The GK110 should have 20 times the double precision performance of the 2009 chip and 7 times the single precision performance. In double precision, that puts the GK110 close to the pace of tripling every year.
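
To see how close, the speedups can be converted into equivalent years of progress at a tripling-per-year pace. This is a rough sketch that takes the estimated GK110 figures above (about 2 teraflops DP, maybe 3.5 teraflops SP) at face value; the helper name is illustrative.

```python
import math


def years_of_progress(speedup: float, annual_factor: float) -> float:
    """Years of growth at annual_factor per year needed to reach speedup."""
    return math.log(speedup) / math.log(annual_factor)


elapsed_years = 2012 - 2009
tripling_target = 3 ** elapsed_years  # speedup if performance had tripled yearly

dp_speedup = 2000 / 100  # ~2 teraflops vs. 100 gigaflops
sp_speedup = 3500 / 500  # ~3.5 teraflops vs. 500 gigaflops

dp_years = years_of_progress(dp_speedup, 3)
sp_years = years_of_progress(sp_speedup, 3)

print(f"tripling for {elapsed_years} years would give {tripling_target}x")
print(f"double precision: {dp_speedup:.0f}x, about {dp_years:.1f} years of progress")
print(f"single precision: {sp_speedup:.0f}x, about {sp_years:.1f} years of progress")
```

On these estimates, double precision has delivered roughly 2.7 years of tripling in three calendar years, while single precision has delivered roughly 1.8.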

Nextbigfuture previously covered the Nvidia Tesla K10 chip and its AMD competition.

VR-zone reported that the Nvidia Maxwell chip has been delayed to 2014. Maxwell will be a 20 nanometer GPU chip. It will be followed in 2015 or 2016 by Nvidia’s Einstein chip, which will be an 11 nanometer or 14 nanometer GPU chip.

The Maxwell GPU architecture should make 2014 one of the key years in Nvidia’s history. Maxwell will be the first top-to-bottom GPU architecture, powering everything from Tegra to Tesla. Furthermore, Maxwell should be the first GPU part to integrate the 64-bit ARM core that carries the codename “Project Denver”. Putting the typically bandwidth-starved ARM cores onto an internal bus, which in GPUs goes beyond 1.5TB/s, should significantly change the game: a GPU capable of booting an operating system, regardless of what currently appears in public documents.

All in all, 2013 will see AMD’s Sea Islands fight first against Nvidia’s Kepler refresh, and only then against Maxwell. The real battle will come only in 2014.

Maxwell should have 16 gigaflops per watt.

It looks like Nvidia is tracking to be about a year (double precision) or two (single precision) behind its 2009 prediction of a 570 times increase in GPU computing power.
