The Design and Evaluation of a High-Performance Earth Science Database

Carter Shock, Chialin Chang, Bongki Moon, Anurag Acharya, Larry Davis, Joel Saltz, Alan Sussman

To appear in the special issue of Parallel Computing on Parallel Data Servers and Applications

Abstract:

Earth scientists have encountered two major obstacles in their attempts to use remotely sensed imagery to analyze the earth's land cover dynamics. First, the volume of data involved is very large; second, significant preprocessing is needed before the data can be used. This is particularly so for studies that analyze global trends using data sets that cover multiple years. In this paper, we present the design of an earth science database as well as our early experiences with it. The primary design goal of this database is to facilitate efficient access to and preprocessing of large volumes of satellite data. Our initial design assumed that the main bottleneck in the system would be retrieving data from the disks. However, experimental results show that precise identification of all the data values corresponding to a query can take a significant amount of time. This problem becomes even more pronounced when the system is designed to minimize the time spent performing I/O. We therefore discuss a major redesign of the system that includes a reworking of the indexing scheme and a reorganization of the data on disks. Preliminary experimental results show that the redesigned system performs significantly better than the original system, providing interactive response times for local queries.
