2011
DOI: 10.1002/cpe.1877

Distributed data mining patterns and services: an architecture and experiments

Abstract: Distributed data mining implements techniques for analyzing data on distributed computing systems by exploiting data distribution and parallel algorithms. The grid is a computing infrastructure for implementing distributed high‐performance applications and solving complex problems, offering effective support to the implementation and use of data mining and knowledge discovery systems. The Web Services Resource Framework has become the standard for the implementation of grid services and applications, a…

Citations: cited by 20 publications (7 citation statements)
References: 26 publications
“…The validity of this assumption has been confirmed in several studies (ie, other works 22,23), which experimentally show that the time needed to send local or summarized information is negligible with respect to the computation time in a large subset of distributed applications.…”
Section: Figure (mentioning)
confidence: 72%
“…The assumption that the communication time is very low with respect to the computation time, and thus can be ignored, is based on the fact that each node sends only summary data to the adjacent nodes. The validity of this assumption has been confirmed in several studies (ie, other works 22,23), which experimentally show that the time needed to send local or summarized information is negligible with respect to the computation time in a large subset of distributed applications. The Petri net formalism, however, can also be used to model a more general scenario in which communication times should be explicitly taken into account.…”
Section: Figure (mentioning)
confidence: 72%
“…This section is meant to briefly illustrate the service-oriented architecture proposed in [4] for carrying out Distributed Data Mining (DDM) tasks over a Grid.…”
Section: Reference Architecture For Distributed Data Mining (mentioning)
confidence: 99%
“…Moreover, we replace the traditional (hard) clustering of the logics-based predictive clustering framework [3] used in [7,8] with a probabilistic clustering scheme, in order to reduce the risk of obtaining lowly accurate cluster predictors (due to the greedy clustering algorithm and to the underlying approximated representation of the log). In order to overcome the severe scalability limitations of [7,8] and make our approach suitable for large logs, both the computation of (probability-aware) trace clusters and of the clusters' predictors are implemented in a parallel and distributed manner, according to the Grid-services-based conceptual architecture defined in [4] for the specification and execution of Distributed Data Mining (DDM) tasks. The underlying grid services were developed according to the WSRF (Web Services Resource Framework) specifications of the WS-Core (a component of the Globus Toolkit 4 (GT4) [9]), and deployed onto a private Cloud-computing platform.…”
Section: Introduction (mentioning)
confidence: 99%