Parallel maintenance of materialized views on personal computer clusters
Authors
Liang, Weifa
Yu, Jeffrey X.
Abstract
A data warehouse is a repository of integrated information that collects and maintains a large amount of data from multiple distributed, autonomous, and possibly heterogeneous data sources. The data are often stored as materialized views to provide fast access to the integrated data. Keeping the warehouse data completely consistent with the remote source data is a challenging problem in a distributed environment, and transactions containing multiple updates at one or more sources complicate it further. Because a data warehouse usually contains a very large amount of data and processing it is time consuming, introducing parallelism into data warehousing becomes inevitable. The popularity of the PC cluster and the cost-effective parallelism it offers make it a promising platform for this purpose. This article considers the complete consistency maintenance of select-project-join (SPJ) materialized views. Several parallel maintenance algorithms for the materialized views are presented for a PC cluster consisting of K personal computers. The key to the proposed algorithms is how to balance the workload among the PCs and how to balance the communication cost among the PCs as well as between the PC cluster and the remote sources.
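As a rough illustration of the idea of spreading SPJ view maintenance over K machines, the following Python sketch hash-partitions one relation by its join key and routes each source update to the single worker that owns the matching partition. It is not one of the article's algorithms: the relation schemas R(a, b) and S(b, c), the names `Worker`, `owner`, `load_s`, `maintain`, and `global_view`, and the routing rule are all assumptions made for this example, and the sketch ignores remote sources, concurrent updates, and the complete-consistency requirement entirely.

```python
from collections import Counter, defaultdict

K = 4  # assumed number of PCs in the cluster


def owner(join_key, k=K):
    """Hypothetical routing rule: hash-partition tuples by join key."""
    return hash(join_key) % k


class Worker:
    """One PC holding a slice of S and the matching slice of the view."""

    def __init__(self):
        self.s_index = defaultdict(list)  # join key b -> S tuples (b, c)
        self.view = Counter()             # projected tuple (a, c) -> multiplicity

    def add_s(self, s_tuple):
        self.s_index[s_tuple[0]].append(s_tuple)

    def apply_r_delta(self, r_tuple, sign, predicate):
        """Apply an insertion (sign=+1) or deletion (sign=-1) of R tuple (a, b)."""
        a, b = r_tuple
        for _, c in self.s_index.get(b, []):   # join on b
            if predicate(a, c):                # select
                self.view[(a, c)] += sign      # project onto (a, c), keep counts
                if self.view[(a, c)] == 0:
                    del self.view[(a, c)]


def load_s(workers, s_tuples):
    """Distribute S so that each join key lives on exactly one worker."""
    for s_tuple in s_tuples:
        workers[owner(s_tuple[0], len(workers))].add_s(s_tuple)


def maintain(workers, r_deltas, predicate):
    """Route each update on R to the worker owning its join key."""
    for r_tuple, sign in r_deltas:
        workers[owner(r_tuple[1], len(workers))].apply_r_delta(r_tuple, sign, predicate)


def global_view(workers):
    """Union (with multiplicities) of the per-worker view slices."""
    total = Counter()
    for w in workers:
        total.update(w.view)
    return total


if __name__ == "__main__":
    workers = [Worker() for _ in range(K)]
    load_s(workers, [(1, 10), (2, 20), (2, 21)])
    deltas = [((100, 1), +1), ((200, 2), +1), ((100, 1), -1)]
    maintain(workers, deltas, predicate=lambda a, c: c >= 10)
    print(global_view(workers))
```

Keeping a multiplicity per projected tuple lets deletions be handled without recomputing the join. Hash partitioning keeps each update local to one worker, but a skewed join-key distribution would unbalance the load, which is the kind of workload and communication trade-off the abstract refers to.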
Source
International Journal of Parallel and Distributed Systems and Networks