Supercomputer Conference: Possible Exascale Disruption and the Best Technical Papers

The Supercomputer conference is Nov 15-20, 2008.

Technological developments in several areas could disrupt the design of exascale supercomputer systems and could lead to viable exascale systems in the 2015-2020 timeframe. Four such technologies are:

* Quantum computing [D-Wave Systems]
* Flash storage: Sun Micro has introduced high-performance flash, from terabytes up to half a petabyte
* Cheap, low-power optical communications: Keren Bergman talks about nanophotonics for on-chip and inter-chip communication
* IBM 3D chip stacking

One year ago IBM announced an advance in chip-stacking technology in a manufacturing environment. It shortens the distance that information needs to travel on a chip to just 1/1000th of that on 2-D chips and allows the addition of up to 100 times more channels, or pathways, for that information to flow.

IBM researchers are exploring concepts for stacking memory on top of processors and, ultimately, for stacking many layers of processor cores.

IBM scientists were able to demonstrate a cooling performance of up to 180 W/cm² per layer for a stack with a typical footprint of 4 cm².
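As a rough check on what that figure implies, the heat removed per layer is the flux multiplied by the footprint. The sketch below does only that arithmetic. The function name and the four-layer stack are illustrative assumptions, not part of IBM's announcement.

```python
def heat_removed_per_layer(flux_w_per_cm2: float, footprint_cm2: float) -> float:
    """Heat removed from one layer in watts: flux (W/cm^2) times area (cm^2)."""
    return flux_w_per_cm2 * footprint_cm2


# Figures quoted in the text: 180 W/cm^2 over a 4 cm^2 footprint.
per_layer_w = heat_removed_per_layer(180.0, 4.0)
print(per_layer_w)  # 720.0

# Hypothetical four-layer stack, assuming every layer reaches the same flux.
print(4 * per_layer_w)  # 2880.0
```

So the quoted numbers correspond to up to about 720 W removed from each layer of the stack.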

Some of the best Technical Papers

Links to all technical paper abstracts are here.

1. High-Radix Crossbar Switches Enabled by Proximity Communication

Parallel applications are usually able to achieve high computational performance but suffer from large latency in I/O accesses. I/O prefetching is an effective solution for masking the latency. Most existing I/O prefetching techniques, however, are conservative, and their effectiveness is limited by low accuracy and coverage. As the processor-I/O performance gap has been increasing rapidly, data-access delay has become a dominant performance bottleneck. We argue that it is time to revisit the “I/O wall” problem and trade excess computing power for data-access speed. We propose a novel pre-execution approach for masking I/O latency. We describe the pre-execution I/O prefetching framework, the pre-execution thread construction methodology, the underlying library support, and the prototype implementation in the ROMIO MPI-IO implementation in MPICH2. Preliminary experiments show that the pre-execution approach is promising in reducing I/O access latency and has real potential.
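The pre-execution idea in this abstract can be illustrated with a toy sketch: a helper thread replays only the part of the program that decides which blocks will be read, and loads them into a cache before the main computation asks for them. This is not the authors' ROMIO/MPICH2 implementation. The class `PrefetchCache`, the function `run`, the in-memory "storage" and the access pattern are all hypothetical, and the helper is joined before the main loop only to keep the result deterministic.

```python
import threading


class PrefetchCache:
    """Toy block cache filled by a helper thread (hypothetical, not ROMIO)."""

    def __init__(self, read_block):
        self._read_block = read_block  # the slow read being masked
        self._cache = {}
        self._lock = threading.Lock()
        self.hits = 0
        self.misses = 0

    def prefetch(self, block_id):
        data = self._read_block(block_id)
        with self._lock:
            self._cache.setdefault(block_id, data)

    def read(self, block_id):
        with self._lock:
            if block_id in self._cache:
                self.hits += 1
                return self._cache.pop(block_id)
            self.misses += 1
        # Not prefetched in time: fall back to the slow path.
        return self._read_block(block_id)


def run(n_blocks=8):
    # Stand-in for a file: block i holds four bytes of value i.
    storage = {i: bytes([i]) * 4 for i in range(n_blocks)}
    cache = PrefetchCache(storage.__getitem__)

    # The "pre-execution slice": only the index computation, no real work.
    pattern = [(3 * i) % n_blocks for i in range(n_blocks)]

    helper = threading.Thread(
        target=lambda: [cache.prefetch(b) for b in pattern]
    )
    helper.start()
    # A real framework would overlap the helper with the main loop;
    # joining first is an assumption made to keep this sketch deterministic.
    helper.join()

    # Main computation: reads each block and does some work on it.
    total = sum(cache.read(b)[0] for b in pattern)
    return total, cache.hits, cache.misses


print(run())  # (28, 8, 0)
```

Every read in the main loop is served from the cache, which is the effect the paper aims for when spare compute is spent running ahead of the I/O.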

2. Benchmarking GPUs to Tune Dense Linear Algebra

We present performance results for dense linear algebra using the 8-series NVIDIA GPUs. Our GEMM routine runs 60% faster than the vendor implementation and approaches the peak of hardware capabilities. Our LU, QR and Cholesky factorizations achieve up to 80-90% of the peak GEMM rate. Our parallel LU running on two GPUs achieves up to ~300 Gflop/s. These results are accomplished by challenging the accepted view of the GPU architecture and programming guidelines. We argue that modern GPUs should be viewed as multithreaded multicore vector units. We exploit register blocking to optimize GEMM and heterogeneity of the system (compute both on GPU and CPU). This study includes detailed benchmarking of the GPU memory system that reveals sizes and latencies of caches and TLB. We present a couple of algorithmic optimizations aimed at increasing parallelism and regularity in the problem that provide us with slightly higher performance.
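The register-blocking idea mentioned in this abstract can be sketched outside CUDA: each small tile of the result is held in local accumulators while the loop streams over the shared dimension, so every loaded element of A is reused across a row of the tile. This pure-Python version shows only that loop structure. It is not the authors' GPU kernel, and the function names and tile size are assumptions chosen for illustration.

```python
def matmul_naive(a, b):
    """Reference product: one dot product per output element."""
    n, k, m = len(a), len(b), len(b[0])
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
        for i in range(n)
    ]


def matmul_blocked(a, b, tile=2):
    """Compute C = A * B one tile of C at a time.

    The tile's accumulators stay local for the whole sweep over k; in a
    real kernel they would live in registers.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        rows = min(tile, n - i0)  # handle a partial tile at the edge
        for j0 in range(0, m, tile):
            cols = min(tile, m - j0)
            acc = [[0] * cols for _ in range(rows)]
            for p in range(k):
                for di in range(rows):
                    a_ip = a[i0 + di][p]  # loaded once, reused across the row
                    for dj in range(cols):
                        acc[di][dj] += a_ip * b[p][j0 + dj]
            for di in range(rows):
                for dj in range(cols):
                    c[i0 + di][j0 + dj] = acc[di][dj]
    return c


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
print(matmul_blocked(a, b))
# [[30, 24, 18], [84, 69, 54], [138, 114, 90]]
```

The blocked and naive versions return the same matrix; the difference is in how often each operand is loaded, which is what matters on hardware with a limited register file and slow memory.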

3. A Scalable Parallel Framework for Analyzing Terascale Molecular Dynamics Trajectories

As parallel algorithms and architectures drive the longest molecular dynamics (MD) simulations towards the millisecond scale, traditional sequential post-simulation data analysis methods are becoming increasingly untenable. Inspired by the programming interface of Google’s MapReduce, we have built a new parallel analysis framework called HiMach, which allows users to write trajectory analysis programs sequentially, and carries out the parallel execution of the programs automatically. We introduce (1) a new MD trajectory data analysis model that is amenable to parallel processing, (2) a new interface for defining trajectories to be analyzed, (3) a novel method to make use of an existing sequential analysis tool called VMD, and (4) an extension to the original MapReduce model to support multiple rounds of analysis. Performance evaluations on up to 512 processor cores demonstrate the efficiency and scalability of the HiMach framework on a Linux cluster.
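The MapReduce-style model described in this abstract can be sketched as follows: the user writes an ordinary sequential function for one trajectory frame, and a small driver applies it to all frames in parallel before combining the results. This is not HiMach's actual interface. The names `analyze`, `atom_distance` and `mean`, the frame layout, and the use of a thread pool in place of a cluster are all assumptions.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def analyze(frames, map_fn, reduce_fn, workers=4):
    """Apply map_fn to every frame in parallel, then reduce the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fn, frames))  # frame order is preserved
    return reduce_fn(mapped)


def atom_distance(frame):
    """User-written, sequential: distance between the first two atoms."""
    return math.dist(frame[0], frame[1])


def mean(values):
    return sum(values) / len(values)


# Hypothetical trajectory: each frame is a list of (x, y, z) atom positions.
# The second atom moves away from the first at a steady rate.
frames = [[(0.0, 0.0, 0.0), (3.0 * t, 4.0 * t, 0.0)] for t in range(4)]

print(analyze(frames, atom_distance, mean))
```

The per-frame function knows nothing about parallelism, which matches the abstract's point that analysis programs are written sequentially and parallelized by the framework.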

The Conference schedule is here.


Greetings from NJ:
Regardless of minor details in calculation or who gets credit, I should point out that if this technology becomes a threat to the oil industry and/or automotive culture, we should expect that our elected officials will find a way to legislate it into oblivion, thereby bowing to the corporate barons. Beware the ides of March!


Yes, I am surprised as well that this is not getting more attention. I have another article about how Australia's Commonwealth Scientific and Industrial Research Organisation (CSIRO) has developed a cheaper and longer-lasting ultracapacitor/lead battery combo.

It shows that any capacitor/battery combination can be made to have price and performance advantages.


I went to Digg and I want to know why every single one of these articles about this AFS Trinity power system doesn't even have 100 diggs yet. Pathetic! When I first came across it I was blown away. This has huge implications if this technology actually, I hope, gets utilized. Using the ultracapacitor along with the batteries is an ingenious method of saving the batteries. It is very surprising that no one had the idea before. If anyone does know of anyone else or a company that did, please tell me.
This type of technology has the extreme potential of allowing the plug-in hybrid to be loved by nearly everyone, and I refuse to buy a new car until this technology goes on the market!!!


What is new is being able to extend the range of the electrical part. If a Toyota Prius gets a lithium-ion battery upgrade, then it can get to 100 mpg using the same calculation (how much fuel is used when it is recharged every night after some commute distance during the day).

Also, what is new is managing the electrical power more efficiently to allow a 40-mile range for an SUV.

Also, what is new is the apparently lower production cost of the battery/ultracapacitor combination.

Calcars discusses it and it was reported in Forbes.

The 300 mpg Aptera uses a similar calculation of fuel efficiency.

Look at the details of the Aptera under the performance tab.

Wikipedia discusses the calculation of fuel efficiency for plug-in hybrids.


The title and article are really misleading. There is nothing new in this technology! It's just a combination of an electric car with "unlimited" mpg and a hybrid car with 30 mpg. Mixing these two, you can get anything from 30 mpg (400-mile range) to infinite mpg (40-mile range).