2017
DOI: 10.1613/jair.5440

Confidence Decision Trees via Online and Active Learning for Streaming Data

Abstract: Decision tree classifiers are a widely used tool in data stream mining. The use of confidence intervals to estimate the gain associated with each split leads to very effective methods, like the popular Hoeffding tree algorithm. From a statistical viewpoint, the analysis of decision tree classifiers in a streaming setting requires knowing when enough new information has been collected to justify splitting a leaf. Although some of the issues in the statistical analysis of Hoeffding trees have been already clarif…
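
The split rule the abstract alludes to can be illustrated with the classic Hoeffding-bound test: a leaf is split once the observed gain gap between the two best candidate splits exceeds a confidence half-width that shrinks as more examples arrive. The sketch below is a minimal illustration of that textbook criterion, not the refined confidence intervals proposed in this paper; the function names `hoeffding_bound` and `should_split` and the parameters `value_range`, `delta`, and `n` are hypothetical names chosen for the example.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Half-width epsilon of a Hoeffding confidence interval.

    With probability at least 1 - delta, the empirical mean of n i.i.d.
    observations bounded in an interval of width value_range lies within
    epsilon of the true mean.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split only when the best candidate beats the runner-up by more
    than the confidence half-width (assumed textbook criterion)."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Illustrative numbers only: the same gain gap justifies a split after
# 1000 examples but not after 100, because the interval is still too wide.
print(should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=100))
print(should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000))
```

A smaller `delta` demands more evidence before splitting, which trades slower tree growth for a lower risk of committing to a sub-optimal split.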

Cited by 13 publications (11 citation statements)
References: 29 publications
“…This limit case is more frequent for small numbers of clients. Beside this, we can observe that the accuracy is stable at about 61% which is an excellent level of performance comparable to state-of-the-art results in the non-distributed case with [16] reporting about 62% of accuracy with a similar request budget. This validates the potential of the approach we propose for communication-efficient distributed learning.…”
supporting
confidence: 74%
“…Examples of this approach can be found in [14] for an application to document classification and in [15] for an application to sentiment analysis. [16] is an article of particular interest to this article as it proposes a new way to train decision trees using an active learning approach that focuses on minimizing the risk of selecting a sub-optimal split when a new leaf is added to the tree by computing confidence intervals. To put it differently, when a new leaf is added to the tree, the comparison is made between the ideal decision tree that could have been made if the true distribution of the samples were known and the expected performance of the current tree.…”
Section: B Active Learning
mentioning
confidence: 99%