Gleaner: Creating ensembles of first-order clauses to improve recall-precision curves

Goadrich, Mark; Oliphant, Louis; Shavlik, Jude W.

doi:10.1007/s10994-006-8958-3

Cited by 41 publications

(31 citation statements)

References 37 publications

(34 reference statements)


“…Interpolation Estimators As suggested by Davis and Goadrich [6] and Goadrich et al [1], we use PR space interpolation as the basis for several estimators. These methods use the non-linear interpolation between known points in PR space derived from a linear interpolation in ROC space.…”

Section: Point Estimators (mentioning)

confidence: 99%

“…These methods use the non-linear interpolation between known points in PR space derived from a linear interpolation in ROC space. Davis and Goadrich [6] and Goadrich et al [1] examine the interpolation in terms of the number of true positives and false positives corresponding to each PR point. Here we perform the same interpolation, but use the recall and precision of the PR points directly, which leads to the surprising observation that the interpolation (from the same PR points) does not depend on π. Theorem 1.…”

Section: Point Estimators (mentioning)

confidence: 99%
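The interpolation described in the statement above can be sketched in a few lines. This is a minimal illustration of the count-based formulation attributed to Davis and Goadrich, not code from any of the cited papers; the function name `interpolate_pr` and its parameters are hypothetical. It assumes two known points A and B given as true-positive and false-positive counts, with B having more true positives than A.

```python
def interpolate_pr(tp_a, fp_a, tp_b, fp_b, total_pos):
    """Interpolate PR points between A and B (inclusive).

    Assumption: a straight line in ROC space means false positives grow
    at a constant rate per additional true positive, so precision is
    recomputed from the interpolated counts rather than averaged.
    """
    if tp_b <= tp_a:
        raise ValueError("B must have more true positives than A")
    # False positives added for each extra true positive.
    slope = (fp_b - fp_a) / (tp_b - tp_a)
    points = []
    for x in range(tp_b - tp_a + 1):
        tp = tp_a + x
        fp = fp_a + slope * x
        recall = tp / total_pos
        precision = tp / (tp + fp) if tp + fp > 0 else 1.0
        points.append((recall, precision))
    return points


# Example: A = (TP=5, FP=5), B = (TP=10, FP=30), 20 positives in total.
pts = interpolate_pr(5, 5, 10, 30, total_pos=20)
```

In this example the intermediate precisions fall below the straight line joining the two endpoints in PR space, which is why linear interpolation in PR space tends to overestimate the area.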

“…PR curves are increasingly used in the machine learning community, particularly for imbalanced data sets where one class is observed more frequently than the other class. On these imbalanced or skewed data sets, PR curves are a useful alternative to ROC curves that can highlight performance differences that are lost in ROC curves [1]. Besides visual inspection of a PR curve, algorithm assessment often uses the area under a PR curve (AUCPR) as a general measure of performance irrespective of any particular threshold or operating point (e.g., [2,3,4,5]).…”

Section: Introduction (mentioning)

confidence: 99%


Area under the Precision-Recall Curve: Point Estimates and Confidence Intervals

Boyd, Eng, Page (2013), Lecture Notes in Computer Science

314

233


Abstract. The area under the precision-recall curve (AUCPR) is a single number summary of the information in the precision-recall (PR) curve. Similar to the receiver operating characteristic curve, the PR curve has its own unique properties that make estimating its enclosed area challenging. Besides a point estimate of the area, an interval estimate is often required to express magnitude and uncertainty. In this paper we perform a computational analysis of common AUCPR estimators and their confidence intervals. We find both satisfactory estimates and invalid procedures and we recommend two simple intervals that are robust to a variety of assumptions.


“…One method of developing more complex first-order models is to retain more than just the best clause found during search. Taking an idea from the Gleaner algorithm [9] which retains an entire set of rules found during search that span the range of recall values, we have developed a second weak learner that retains a set of the best rules found during search. This weak learner, PRankBoost.Path, contains all rules along the path from the most general rule to the highest-scoring rule found during search.…”

Section: Weak Learners (mentioning)

confidence: 99%
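The idea of retaining rules that span the range of recall values can be sketched as follows. This is an illustration only, not the Gleaner implementation: the function name `keep_best_per_recall_bin`, the equal-width bins, and the use of precision as the within-bin score are all assumptions made for the example.

```python
def keep_best_per_recall_bin(clauses, num_bins=10):
    """Keep one clause per recall bin.

    `clauses` is an iterable of (name, recall, precision) tuples.
    Assumptions: recall in [0, 1] is split into equal-width bins, and
    within a bin the clause with the highest precision is retained.
    """
    best = {}
    for name, recall, precision in clauses:
        # Recall of exactly 1.0 is placed in the last bin.
        idx = min(int(recall * num_bins), num_bins - 1)
        if idx not in best or precision > best[idx][2]:
            best[idx] = (name, recall, precision)
    return best


# Hypothetical clauses scored during a search.
candidates = [
    ("c1", 0.12, 0.90),
    ("c2", 0.18, 0.95),  # same bin as c1, higher precision
    ("c3", 0.55, 0.60),
    ("c4", 1.00, 0.10),  # the most general clause
]
kept = keep_best_per_recall_bin(candidates, num_bins=10)
```

The retained set covers low-recall, high-precision clauses as well as high-recall, low-precision ones, which is what lets an ensemble trade off the two.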

“…In the Inductive Logic Programming [6] domain ensembles have been successfully used to increase performance [5,9,10]. Successful ensemble approaches must both learn individual classifiers that work well with a set of other classifiers as well as combine those classifiers in a way that maximizes performance.…”

Section: Introduction (mentioning)

confidence: 99%

Boosting First-Order Clauses for Large, Skewed Data Sets

Oliphant, Burnside, Shavlik (2010), Inductive Logic Programming

Self Cite


Abstract. Creating an effective ensemble of clauses for large, skewed data sets requires finding a diverse, high-scoring set of clauses and then combining them in such a way as to maximize predictive performance. We have adapted the RankBoost algorithm in order to maximize area under the recall-precision curve, a much better metric when working with highly skewed data sets than ROC curves. We have also explored a range of possibilities for the weak hypotheses used by our modified RankBoost algorithm beyond using individual clauses. We provide results on four large, skewed data sets showing that our modified RankBoost algorithm outperforms the original on area under the recall-precision curves.


Out-of-Distribution Detection Using an Ensemble of Self Supervised Leave-Out Classifiers

Vyas, Jammalamadaka, Zhu, et al. (2018), Lecture Notes in Computer Science

181

150


As deep learning methods form a critical part in commercially important applications such as autonomous driving and medical diagnostics, it is important to reliably detect out-of-distribution (OOD) inputs while employing these algorithms. In this work, we propose an OOD detection algorithm that comprises an ensemble of classifiers. We train each classifier in a self-supervised manner by leaving out a random subset of training data as OOD data and the rest as in-distribution (ID) data. We propose a novel margin-based loss over the softmax output which seeks to maintain at least a margin m between the average entropy of the OOD and in-distribution samples. In conjunction with the standard cross-entropy loss, we minimize the novel loss to train an ensemble of classifiers. We also propose a novel method to combine the outputs of the ensemble of classifiers to obtain an OOD detection score and class prediction. Overall, our method convincingly outperforms Hendrycks et al. [7] and the current state-of-the-art ODIN [13] on several OOD detection benchmarks.
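One way to read the margin-based loss in this abstract is as a hinge on the gap between average softmax entropies. The sketch below is an assumption about the general shape of such a loss, not the paper's exact formulation; the names `entropy` and `margin_entropy_loss` are hypothetical, and the inputs are plain lists of softmax probability vectors.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of one softmax output."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def margin_entropy_loss(id_probs, ood_probs, m):
    """Hinge-style penalty (assumed form).

    The loss is zero once the mean entropy of OOD samples exceeds the
    mean entropy of in-distribution samples by at least the margin m.
    """
    mean_id = sum(entropy(p) for p in id_probs) / len(id_probs)
    mean_ood = sum(entropy(p) for p in ood_probs) / len(ood_probs)
    return max(0.0, m + mean_id - mean_ood)


# Confident ID prediction versus a uniform OOD prediction over 2 classes.
loss_small_margin = margin_entropy_loss([[1.0, 0.0]], [[0.5, 0.5]], m=0.4)
loss_large_margin = margin_entropy_loss([[1.0, 0.0]], [[0.5, 0.5]], m=1.0)
```

With a margin of 0.4 the entropy gap of ln 2 already satisfies the constraint, so the penalty is zero; with a margin of 1.0 the remaining shortfall is penalized.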
