2009 10th IEEE/ACM International Conference on Grid Computing 2009
DOI: 10.1109/grid.2009.5353076

Finding associations in Grid monitoring data

Abstract: Error handling is a crucial task in infrastructures as complex as grids. Today, there are several monitoring tools which can be used to report failing grid jobs including corresponding error codes. However, the error codes do not always indicate the actual fault which originally caused the job failure. Human time and expertise is required to manually trace errors back to the real fault underlying an error. We perform Association Rule Mining on grid job monitoring data to automatically retrieve knowledge about …
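
The abstract names Association Rule Mining but the excerpt is truncated before it describes the algorithm or the data schema. The following is only a minimal sketch of the general technique, assuming each job record is a set of hypothetical attribute=value items (the site, queue, and error values below are invented for illustration and are not from the paper). It counts single items and pairs, then keeps rules of the form "attribute => error" whose support and confidence meet thresholds chosen arbitrarily here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical job records (not from the paper): one set of items per job.
JOBS = [
    frozenset({"site=A", "queue=long", "error=timeout"}),
    frozenset({"site=A", "queue=short", "error=timeout"}),
    frozenset({"site=A", "queue=long", "error=timeout"}),
    frozenset({"site=B", "queue=long", "error=none"}),
    frozenset({"site=B", "queue=short", "error=none"}),
]


def mine_rules(records, min_support=0.4, min_confidence=0.8,
               target_prefix="error="):
    """Return (antecedent, consequent, support, confidence) tuples for
    single-item rules whose consequent starts with target_prefix."""
    n = len(records)
    counts = Counter()
    # Count how many records contain each single item and each item pair.
    for rec in records:
        items = sorted(rec)
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1

    rules = []
    for itemset, count in counts.items():
        support = count / n
        if len(itemset) != 2 or support < min_support:
            continue
        for consequent in itemset:
            if not consequent.startswith(target_prefix):
                continue
            (antecedent,) = itemset - {consequent}
            if antecedent.startswith(target_prefix):
                continue
            # confidence = P(consequent | antecedent)
            confidence = count / counts[frozenset({antecedent})]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    for antecedent, consequent, support, confidence in mine_rules(JOBS):
        print(f"{antecedent} => {consequent} "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

On this toy data the pair (queue=long, error=timeout) is frequent enough but fails the confidence threshold, which illustrates how confidence separates attributes that merely co-occur with an error from those that reliably accompany it. A real analysis would use larger itemsets and an established miner such as Apriori or FP-Growth.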

Cited by 1 publication (2 citation statements)
References 14 publications (12 reference statements)
“…Beyond simple traces collection, many recent works have focused on post-analysis of the data archived [10], [11], [12]. In the AMon monitoring system [10], most relevant information on jobs submitted to EGEE is filtered out of the traces in order to help users to monitor experiments yielding large amounts of jobs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Their study is based on CONDOR and experiments have been made both on a local grid and on the Open Science Grid. Maier et al. [12] pointed out the fact that error codes returned by systems do not always properly identify the real cause of failure. They are using data mining techniques on EGEE traces in order to determine the root cause for faults.…”
Section: Related Work (mentioning)
confidence: 99%