Distributed relational databases now constitute a large part of the information storage handled by a variety of users. Knowledge extraction from these databases has been studied extensively during the last decade. However, a problem that persists in the distributed data mining process is the communication cost between the different parts of the database, which are naturally located at remote sites. In this paper we present a decision tree classification approach with a low-cost communication strategy that uses a set of the most useful inter-base links for the classification task. Experiments conducted on real datasets showed a significant reduction in communication costs and an accuracy almost identical to that of some traditional approaches.
Multiple Sequence Alignment (MSA) approaches do not always provide consistent solutions; alignments become increasingly difficult when the sequences have low similarity. Tabu search is a useful metaheuristic approach for solving optimization problems. For the alignment of multiple sequences, which is an NP-hard problem, we apply a tabu search algorithm improved by several neighborhood generation techniques using guide trees. The algorithm is tested on the BAliBASE benchmarking database, and experiments showed encouraging results compared to the other algorithms studied in this paper.
Voluminous geographic data have been, and continue to be, collected from various Geographic Information Systems (GIS) applications such as Global Positioning Systems (GPS) and high-resolution remote sensing. For these applications, huge amounts of data are maintained in multiple disparate databases that differ in spatial data type, file format, data schema, access mechanism, etc. Spatial data mining and knowledge discovery has emerged as an active research field that focuses on the development of theory, methodology, and practice for the extraction of useful information and knowledge from massive and complex spatial databases. This chapter highlights recent theoretical and applied research in geographic knowledge discovery and spatial data mining in a distributed environment where spatial data are dispersed across multiple sites. The author presents an overall picture of how spatial multi-database mining is achieved through several common spatial data mining tasks, including spatial cluster analysis, spatial association rule mining, and spatial classification.