IBM has shifted its Cell processors to 65 nanometers and, with the PowerXCell 8i, improved double precision performance by up to five times. Double precision performance is very important for scientific and supercomputing applications.
New PowerXCell 8i processors in QS22 blades deliver:
● 460 single precision (SP) GFLOPS/217 double precision (DP) GFLOPS per blade
● 6.4/3.0 TFLOPS (SP/DP peak) in a single BladeCenter chassis (14 blades)
● 25.8/12.18 TFLOPS (SP/DP peak) in a standard 42U rack with 56 blades installed
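The chassis and rack figures follow from multiplying the per-blade peaks by the blade count. A minimal Python sketch of that arithmetic; the constant and function names are illustrative, not from IBM:

```python
# Per-blade peak figures quoted above, in GFLOPS.
SP_PER_BLADE = 460.0
DP_PER_BLADE = 217.0


def aggregate_tflops(per_blade_gflops, blades):
    """Scale a per-blade peak (GFLOPS) to a group of blades (TFLOPS)."""
    return per_blade_gflops * blades / 1000.0


# 14 blades per BladeCenter chassis, 56 blades per 42U rack (from the list above).
for label, blades in (("chassis", 14), ("rack", 56)):
    sp = aggregate_tflops(SP_PER_BLADE, blades)
    dp = aggregate_tflops(DP_PER_BLADE, blades)
    print(f"{label}: {sp:.2f} SP / {dp:.2f} DP TFLOPS")
```

The products land close to the quoted totals. The rack double precision result comes out slightly below 12.18, presumably because the per-blade figure is rounded.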
Here is a comparison of the latest FPGAs with the latest quad-core Opteron: the 2.5 GHz quad-core Opteron against the Virtex-5 LX330, SX95T and recently announced SX240T FPGAs.
For the quad-core Opteron, this equates to a theoretical peak of 40 Gflop/s in 64-bit mode (4 ops/clk * 4 cores * 2.5 GHz) and 80 Gflop/s in 32-bit mode. For actual predicted performance, microprocessors use DGEMM (64-bit matrix multiply), which typically reaches 80 percent to 90 percent of the peak.
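The theoretical peak is simply operations per clock times core count times clock rate. A minimal Python sketch; the function name is illustrative, and the 8 ops/clk used for 32-bit mode is an assumption inferred from the 80 Gflop/s figure being double the 64-bit one:

```python
def peak_gflops(ops_per_clock, cores, clock_ghz):
    """Theoretical peak in Gflop/s: ops per clock * cores * clock rate (GHz)."""
    return ops_per_clock * cores * clock_ghz


# Quad-core Opteron at 2.5 GHz, 4 ops/clk in 64-bit mode (figures from the text).
peak_64 = peak_gflops(ops_per_clock=4, cores=4, clock_ghz=2.5)

# Assumption: 32-bit mode issues twice as many ops per clock (8),
# which is what the quoted 80 Gflop/s implies.
peak_32 = peak_gflops(ops_per_clock=8, cores=4, clock_ghz=2.5)

print(f"64-bit peak: {peak_64:.0f} Gflop/s")
print(f"32-bit peak: {peak_32:.0f} Gflop/s")
```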
The SX240T can achieve 1.5 to 4.78 times the Opteron's DGEMM speeds. DGEMM performance on a microprocessor is the best actual performance and is typically hand-coded in assembler by the microprocessor vendor. Typical user code that has been run through a compiler normally achieves maybe 25 percent of the peak, and even less as the number of cores increases.
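To put the 25 percent figure in absolute terms, a sustained rate can be estimated as a fraction of the theoretical peak. A rough Python sketch, assuming the 40 and 80 Gflop/s Opteron peaks quoted earlier; the helper name is hypothetical and the fraction is the rough estimate from the text, not a measurement:

```python
def sustained_gflops(peak_gflops, fraction_of_peak):
    """Estimate sustained Gflop/s as a fraction (0..1) of theoretical peak."""
    if not 0.0 <= fraction_of_peak <= 1.0:
        raise ValueError("fraction_of_peak must be between 0 and 1")
    return peak_gflops * fraction_of_peak


# "Maybe 25 percent of the peak" for typical compiled user code.
USER_CODE_FRACTION = 0.25

user_64 = sustained_gflops(40.0, USER_CODE_FRACTION)  # 64-bit mode
user_32 = sustained_gflops(80.0, USER_CODE_FRACTION)  # 32-bit mode

print(f"typical user code: {user_64:.0f} Gflop/s (64-bit), "
      f"{user_32:.0f} Gflop/s (32-bit)")
```

On these assumptions, typical compiled code on the quad-core Opteron would sustain roughly 10 Gflop/s in 64-bit mode, and less if efficiency drops as cores are added.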
AMD’s FireStream 9170 chipset includes 660 million transistors and 320 processing units and delivers a peak of 500 gigaflops. The FireStream 9170 is a step on the way to AMD’s Fusion project, which the company says will combine a graphics processor and a general-purpose processor on the same piece of silicon. AMD hopes to release Fusion in 2009.
A comparison of Cell processors, GPGPUs, and FPGAs