Improving OLAP Performance by Multidimensional Hierarchical Clustering

Markl Volker; Ramsak Frank; Bayer Rudolf

I0007

Title:: Improving OLAP Performance by Multidimensional Hierarchical Clustering
Document type:: Technical Report
Author(s):: Markl Volker; Ramsak Frank; Bayer Rudolf
Abstract:: Data-warehousing applications cope with enormous data sets in the range of gigabytes and terabytes. Queries usually either select a very small subset of this data or perform aggregations on a fairly large data set. Materialized views storing pre-computed aggregates are used to process aggregation queries efficiently. This approach increases disk space requirements and slows down updates because of the view maintenance problem. Multidimensional hierarchical clustering of OLAP data overcomes these problems while offering more flexibility for aggregation paths. In addition, it has the potential to replace several bitmap indexes, which are used to process high-selectivity queries efficiently. We investigate query processing in OLAP environments and identify typical query patterns. Clustering is introduced as a way to speed up aggregation queries without additional storage cost. We also show the potential of multidimensional hierarchical clustering to reduce storage and maintenance costs for indexes. Clustering possibilities for OLAP data are investigated. The UB-Tree and the Tetris algorithm are described as the physical storage structure and access method for clustered OLAP data. The performance and storage cost of our access method are investigated and compared to current query processing scenarios. In addition, performance measurements on real-world data for a typical star schema are presented.
Keywords:: UB-Tree; hierarchy encoding; OLAP; Data Warehousing; Multidimensional Clustering; Hierarchical Clustering; Performance
Year:: 2000
Year / Month:: 2000-03
Pages/Extent:: 26