Efficient mining of association rules in distributed databases

Cheung, David W.; Ng, Vincent; Fu, Ada Wai-Chee; Fu, Yongjian

doi:10.1109/69.553158

Cited by 277 publications

(84 citation statements)

References 13 publications


“…Either way of distribution divides the rows or columns of a table into different parts. Distributed data mining approaches for horizontally partitioned data include meta-learning [3] that merges models built from different sites, and privacy preserving techniques including decision tree [8] and association rule mining [7]. Those for vertically partitioned data include association rule mining [12] and k-means clustering [13].…”

Section: Related Work (mentioning)

confidence: 99%

“…However, perfect integration of heterogeneous data sources is a very challenging problem, and it is often impossible to migrate one whole database to another site. In contrast, distributed data mining [3,7,8,12,13] aims at discovering knowledge from a dataset that is stored at different sites. But they focus on a homogeneous dataset (a single table or a set of transactions) that is distributed to multiple sites, thus are unable to handle heterogeneous relational databases.…”

Section: Introduction (mentioning)

confidence: 99%


Efficient Classification from Multiple Heterogeneous Databases

Yin

2005

Knowledge Discovery in Databases: PKDD 2005


Abstract. With the fast expansion of computer networks, it is inevitable to study data mining on heterogeneous databases. In this paper we propose MDBM, an accurate and efficient approach for classification on multiple heterogeneous databases. We propose a regression-based method for predicting the usefulness of inter-database links that serve as bridges for information transfer, because such links are automatically detected and may or may not be useful or even valid. Because of the high cost of inter-database communication, MDBM employs a new strategy for cross-database classification, which finds and performs actions with high benefit-to-cost ratios. The experiments show that MDBM achieves high accuracy in cross-database classification, with much higher efficiency than previous approaches.


“…[14][15][16][17][18][19] have focused on parallel and distributed algorithms for mining association rules, but not the closed itemsets mining. In [1], an algorithm of synthesizing high frequency rules from different data sources based on weight model is proposed.…”

Section: Related Work (mentioning)

confidence: 99%

Distributed Frequent Closed Itemsets Mining

Liu, Zhang, Cai, et al.

2007

2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System


“…One of the most important fields of the datamining domain is the association mining. Many interesting and efficient mining of association rule algorithms have been proposed in the different mining literature [2,3,4,5,6,8,9,10,11]. In this paper we present an efficient association-mining algorithm for large dataset.…”

Section: Introduction (mentioning)

confidence: 99%

A Compress-Based Association Mining Algorithm for Large Dataset

Ashrafi, Taniar, Smith

2003

Lecture Notes in Computer Science


Abstract. Association mining is one of the primary sub-areas of data mining. The technique has been used in numerous practical applications, including consumer market basket analysis, inferring patterns from web page access logs, network intrusion detection, and pattern discovery in biological applications. Most traditional association-mining algorithms assume that the whole dataset can be loaded into main memory, so problems arise when such algorithms are applied to large datasets. In this paper we present a new algorithm for association mining. Our algorithm is efficient when the dataset is too large to be loaded into main memory. The proposed algorithm also reduces the frequent-itemset search space by eliminating non-frequent 1-itemsets after the first pass. Our performance evaluation shows that the algorithm outperforms the Apriori algorithm on different datasets.
