GPFS Scans 10 Billion Files in 43 Minutes

Source paper: IBM GPFS Scans 10 Billion Files in 43 Minutes (28 pages)

By using a small cluster of ten IBM xSeries® servers, IBM’s cluster file system (GPFS™), and a new solid-state storage appliance from Violin Memory to hold the file system metadata, IBM Research demonstrated, for the first time, the ability to perform policy-guided storage management (daily tasks such as selecting files for backup, migration, etc.) on a 10-billion-file environment in 43 minutes. This new record shatters the previous record by a factor of 37; GPFS also set that previous record, in 2007.
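To make "policy-guided storage management" concrete: GPFS evaluates SQL-like policy rules (applied with the mmapplypolicy command) against each file's metadata to decide which files to back up, migrate, or delete. The Python below is only a minimal sketch of that selection logic, with hypothetical field names and thresholds; it is not GPFS's actual rule engine.

```python
import time
from dataclasses import dataclass

# Hypothetical metadata record for one file. A real GPFS scan reads
# these attributes from inodes, not from Python objects.
@dataclass
class FileMeta:
    path: str
    size: int    # bytes
    atime: float # last access time, seconds since the epoch

def select_for_migration(files, max_idle_days=30, min_size=1 << 20):
    """Pick files idle for more than `max_idle_days` and at least 1 MiB,
    roughly what a MIGRATE policy rule with a WHERE clause expresses."""
    cutoff = time.time() - max_idle_days * 86400
    return [f for f in files if f.atime < cutoff and f.size >= min_size]

if __name__ == "__main__":
    now = time.time()
    sample = [
        FileMeta("/gpfs/projects/old.dat", 5 << 20, now - 90 * 86400),
        FileMeta("/gpfs/projects/hot.dat", 5 << 20, now - 3600),
    ]
    for f in select_for_migration(sample):
        print("migrate:", f.path)  # prints only the cold file
```

The hard part at 10-billion-file scale is not this per-file test but scanning the metadata fast enough to evaluate it, which is where the parallel algorithms and solid-state metadata store come in.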

This document describes a demonstration in which GPFS takes 43 minutes to process the 6.5 TB of metadata for a file system containing 10 billion files. The accomplishment combines enhanced algorithms in GPFS with the use of solid-state storage as the GPFS metadata store. IBM Research once again pushes GPFS scalability to an unprecedented file system size, enabling much larger data environments to be unified on a single platform and dramatically simplifying data management tasks such as data placement, aging, backup, and replication.
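The quoted figures imply remarkable sustained rates. As a quick sanity check (my arithmetic, derived from the numbers above, not taken from the paper):

```python
# Back-of-the-envelope rates implied by the figures quoted above.
files = 10_000_000_000   # 10 billion files
metadata_bytes = 6.5e12  # 6.5 TB of metadata
seconds = 43 * 60        # 43 minutes

print(f"{files / seconds:,.0f} files/s")            # ~3.9 million files/s
print(f"{metadata_bytes / seconds / 1e9:.2f} GB/s") # ~2.5 GB/s of metadata
print(f"{metadata_bytes / files:.0f} bytes/file")   # ~650 bytes of metadata per file
```

Roughly 3.9 million files evaluated per second, sustained for the full 43 minutes across a ten-node cluster, which is exactly the kind of workload that rewards putting metadata on solid-state storage.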
