Computer Science Department University of Maryland College Park
The data cube is an aggregate operator which has been shown to be very powerful for On Line Analytical Processing (OLAP) in the context of data warehousing. It is, however, very expensive to compute, access, and maintain. In this paper we define the ``cubetree'' as a storage abstraction of the cube and realize it using packed R-trees for most efficient cube queries. We then reduce the problem of creation and maintenance of the cube to sorting and incremental bulk merge-packing of cubetrees. This merge-pack has been implemented to use separate storage for writing the updated cubetrees, therefore allowing cube queries to continue even during maintenance. Finally, we characterize the size of the delta increment for achieving good bulk update schedules for the cube. The paper includes experiments with various data sets measuring query and bulk update performance.