The Opsis Project - CubeTrees

Overview

Air Cache

Cubetrees

Dynamat

MOCHA

Online Updates

Transcurrent Execution Model

WebViews

People

Papers

Demos

Links

CubeTree Storage Organization
In this work, we introduced the Cubetree as a storage abstraction of the Data Cube and argued for a compact representation for it using packed R-trees. We developed algorithms for allocating each cube query to the projected Cubetree that gives good clustering and for selecting good sort order within each Cubetree. We then used Cubetree as an alternative storage organization for ROLAP aggregate views and presented an efficient algorithm for mapping an arbitrary set of views to a collection of Cubetrees that achieve excellent performance.
The Cubetree organization combines both storage and indexing in a single data structure and still within the relational paradigm. We have shown that when compared to a conventional (relational) storage organization of materialized OLAP views, Cubetrees offer at least a 2-1 storage reduction, a 10-1 better OLAP query performance, and a 100-1 faster updates. We compared the two alternative approaches with data generated from the TPC-D benchmark and stored in the Informix Universal Server (IUS). The straight forward implementation materializes the Relational-OLAP (ROLAP) views using IUS tables which are then indexed with B-trees. The Cubetree implementation materializes the same ROLAP views using a Cubetree Datablade developed for IUS. This Datablade implements the Cubetrees, the mapping algorithm and the supporting routines on the Informix Universal Server. It defines a Cubetree access method as an alternative primary storage organization for materialized views and provides the end-user with a clean and transparent SQL interface. For all Cubetree experiments we used this interface to make fair comparisons of the Cubetree performance with a commercial industrial strength system. The experiments demonstrate that the Cubetree storage organization is superior in storage, query performance and update speed.
Perhaps the most critical issue in data warehouse environments is the time to generate and/or refresh its derived data from the raw data. The mere size of them does not permit frequent re-computation. Creating a new cube every time an update increment is obtained is not only wasteful but, it may require a window that leaves no time for OLAP. This is especially critical because the off-line window for computing the cube and its indexes has shrank due to the international operations of the organizations. In this work we argued that maintenance of the cube should be considered as a bulk incremental update operation and that record-at-a-time updates or full re-computation are not viable solutions. We have reduced the incremental refresh problem to sorting the update increment and merge-packing the Cubetrees. Rewriting the Cubetrees into fresh storage permits the use of the old Cubetrees during maintenance and, thus, eliminates query down time. Then, based on the rewrite cost, we investigated how to obtain good refresh schedules for amortizing maintenance cost.
CubeTrees Papers

An Alternative Storage Organization for ROLAP Aggregate Views Based on Cubetrees

Yannis Kotidis, Nick Roussopoulos. In the Proceedings of the ACM SIGMOD International Conference on Management of Data, Seattle, Washington, June 1998

Available in: gzipped Postscript

A Generalized Framework for Indexing OLAP Aggregates

Yannis Kotidis, Nick Roussopoulos. Technical Report, Dept of Computer Science, Univ. of Maryland, October 1997

Available in: gzipped Postscript

Cubetree: Organization of and Bulk Incremental Updates on the Data Cube

Nick Roussopoulos, Yannis Kotidis, Mema Roussopoulos. In the Proceedings of the ACM SIGMOD International Conference on Management of Data, Tucson, Arizona, May 1997

Available in: gzipped Postscript

CubeTrees People

Nick Roussopoulos
Yannis Kotidis

Last update was on January 16, 2002	Page Design
Send comments or questions to Nick Roussopoulos

Web Accessibility