Study of Scalable Declustering Algorithms for Parallel Grid Files

Bongki Moon, Anurag Acharya, Joel Saltz To appear in IPPS'96

Abstract:

Efficient storage and retrieval of large multidimensional datasets has been one of the key issues in large-scale scientific computations such as long running time-dependent simulations periodically generating snapshots of the state. The main challenge for efficiently handling such datasets is to minimize response time for multidimensional range queries. The grid file is one of the well known access methods for multidimensional and spatial data. We investigate effective and scalable declustering techniques for grid files with the primary goal of minimizing response time and the secondary goal of maximizing the fairness of data distribution. The main contributions of this paper is (1) the analytic and experimental evaluation of existing index-based declustering techniques and their extensions for grid files, and (2) the development of an improved proximity-based declustering algorithm called minimax which has been experimentally shown to scale well and consistently achieve better response time compared to all the other algorithms while maintaining perfect disk distribution.

Postscript (compressed 259K)

A previous version appeared as UMCP-CSD:CS-TR-3589 (compressed 280K)