1996
DOI: 10.1007/3-540-61442-7_8

Speeding up knowledge discovery in large relational databases by means of a new discretization algorithm

Cited by 11 publications (6 citation statements)
References 7 publications
“…Thus they are expected to outperform previous methods, especially when learning from large data. It is desirable that a machine learning algorithm maximize the information it derives from large data sets, since increasing the size of a data set can provide a domain-independent way of achieving higher accuracy (Freitas and Lavington 1996; Provost and Aronis 1996). This is especially important since large data sets with high-dimensional attribute spaces and huge numbers of instances are increasingly used in real-world applications, and naive-Bayes classifiers are particularly attractive to these applications because of their space and time efficiency.…”
Section: Results
confidence: 99%
“…Then, using the statistical χ² test, the adjacent pair of intervals with the lowest χ² value is merged into one interval, and this process is repeated until no adjacent pair has a χ² value below the predetermined threshold (Kerber 1992; Freitas and Lavington 1996).…”
Section: ChiMerge
confidence: 99%
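
The bottom-up merging loop described in this statement is straightforward to sketch. The following Python fragment is a minimal illustration of ChiMerge-style merging, not the cited paper's implementation; the names (chi2, chimerge) and the list-of-class-frequency-counts representation are assumptions made for this sketch.

def chi2(a, b):
    # Chi-square statistic over the class-frequency counts of two
    # adjacent intervals (a[c] = count of class c in the first interval).
    total = sum(a) + sum(b)
    chi = 0.0
    for counts in (a, b):
        n = sum(counts)
        for c in range(len(a)):
            expected = n * (a[c] + b[c]) / total
            if expected > 0:
                chi += (counts[c] - expected) ** 2 / expected
    return chi

def chimerge(boundaries, freqs, threshold):
    # boundaries: sorted interval edges (len(freqs) + 1 values);
    # freqs: one class-frequency list per interval;
    # threshold: chi-square value from a table at the chosen significance
    # level, with k - 1 degrees of freedom for k classes (Kerber 1992).
    while len(freqs) > 1:
        scores = [chi2(freqs[i], freqs[i + 1]) for i in range(len(freqs) - 1)]
        i = min(range(len(scores)), key=scores.__getitem__)
        if scores[i] >= threshold:
            break  # every adjacent pair now exceeds the threshold
        # Merge intervals i and i+1: add counts, drop the shared boundary.
        freqs[i] = [x + y for x, y in zip(freqs[i], freqs[i + 1])]
        del freqs[i + 1]
        del boundaries[i + 1]
    return boundaries

For a two-class problem, for example, a 95% significance level gives a threshold of 3.84 (one degree of freedom).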
“…We have chosen supervised techniques because, using classification information, we can reduce the probability of grouping different classes in the same interval. [FAL96], an information-theoretic algorithm, substitutes the ChiMerge/StatDisc statistical measures with an information loss function in a bottom-up iterative process. This approach is similar to the C4.5 local discretization process, but in order to apply it in a global algorithm a correction factor needs to be used.…”
Section: Discretization Algorithm
confidence: 99%
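
Under this statement's description, the bottom-up process has the same shape as ChiMerge but ranks adjacent pairs by an information loss function instead of χ². The cited paper's exact loss function and correction factor are not given in this excerpt, so the entropy-based measure below is only an assumed stand-in for illustration; plugging it in place of chi2 in the loop sketched earlier, with a loss threshold as the stopping rule, reproduces the general form of such a process.

import math

def entropy(counts):
    # Class entropy of one interval's class-frequency counts.
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c) if n else 0.0

def info_loss(a, b):
    # Weighted increase in class entropy caused by merging two adjacent
    # intervals: always >= 0, and small when the intervals' class
    # distributions are similar (i.e. merging loses little information).
    merged = [x + y for x, y in zip(a, b)]
    return (sum(a) + sum(b)) * entropy(merged) \
        - sum(a) * entropy(a) - sum(b) * entropy(b)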