The Random Forest Kernel and other kernels for big data from random partitions
Preprint, 2014
DOI: 10.48550/arxiv.1402.4293

Cited by 19 publications (32 citation statements)
References 0 publications

“…MAPLE fits a regression forest to the outputs of a black-box model, and then uses a feature importance selector called DSTUMP (Kazemitabar et al. 2017) to select the most important features. When an explanation is desired, MAPLE uses SILO (Bloniarz et al. 2016), a local linear modeling technique that uses random forests to identify supervised neighbors (Davies and Ghahramani 2014; He et al. 2014), to generate a prediction. Specifically, given an instance to predict x_t, SILO generates a local training distribution based on how often a training instance x_i ends at the same terminal node as x_t.…”
Section: Sample-based Explanations
confidence: 99%
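The SILO weighting described in this excerpt is straightforward to prototype. Below is a minimal sketch, assuming a fitted scikit-learn RandomForestRegressor; the function name `local_weights` and the exact normalization are my own illustrative choices, not the published SILO implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def local_weights(forest, X_train, x_t):
    """Weight each training instance by how often it shares a terminal
    (leaf) node with the query point x_t across the trees of the forest.

    Illustrative sketch only; normalization may differ from SILO's.
    """
    # apply() returns, for each sample, the leaf index it reaches in each tree.
    train_leaves = forest.apply(X_train)             # (n_train, n_trees)
    query_leaves = forest.apply(x_t.reshape(1, -1))  # (1, n_trees)
    # Per-tree indicator: does training instance i share a leaf with x_t?
    same_leaf = train_leaves == query_leaves         # (n_train, n_trees)
    # Normalize by matching-leaf size so each tree contributes a proper
    # distribution over training instances, then average across trees.
    leaf_sizes = same_leaf.sum(axis=0)               # leaf size per tree
    return (same_leaf / np.maximum(leaf_sizes, 1)).mean(axis=1)
```

The returned weights form the "local training distribution" the excerpt mentions; a weighted local linear model fit with them yields MAPLE's explanation.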
“…Another approach is to use random partitions [5]. Random partitions adopt a different approach in the sense that the method strives to infer the model from the training instances only, without any prior formulation of the measure or any similarity constraints.…”
Section: Learning Dissimilarity Representations
confidence: 99%
“…The key idea of random partitions is to define multiple randomized partitions of the input space in such a way that they form homogeneous groups (clusters) of instances. It has been proven that such random partitions can be used to define kernels, which can be viewed as (dis)similarity measurements [5], [6]. Beyond these mathematical demonstrations, random partitions can be used directly in practice to measure similarities, as with the well-known proximity measure of random forests [4], [9], [10].…”
Section: Learning Dissimilarity Representations
confidence: 99%
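The construction in this excerpt is easy to make concrete. The sketch below, assuming scikit-learn, computes the classical random forest proximity as a kernel: each tree is a random partition of the input space into leaves, and the kernel value for a pair of points is the fraction of trees that place them in the same leaf. The function name is illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def random_forest_kernel(forest, X_a, X_b):
    """K[i, j] = fraction of trees in which X_a[i] and X_b[j] share a leaf.

    Each tree induces a partition kernel (1 if same leaf, else 0), which is
    positive semi-definite; an average of PSD kernels is PSD, so K is a
    valid kernel. For very large datasets, accumulate tree by tree instead
    of materializing the 3-D comparison array used here for brevity.
    """
    leaves_a = forest.apply(X_a)   # (n_a, n_trees) leaf indices
    leaves_b = forest.apply(X_b)   # (n_b, n_trees)
    same = leaves_a[:, None, :] == leaves_b[None, :, :]  # (n_a, n_b, n_trees)
    return same.mean(axis=2)
```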
“…Due to the intrinsic tree-building process, random forest estimators can easily handle both univariate and multivariate data with few parameters to tune. Besides, these methods have good predictive power and can outperform standard kernel methods (Davies and Ghahramani, 2014; Scornet, 2016c). Lastly, being based on the random forest algorithm, they are also easily parallelizable and can handle large datasets.…”
Section: Introduction
confidence: 99%
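Because the resulting Gram matrix is a valid kernel, it can be plugged directly into standard kernel methods, which is one way the comparison with standard kernels in this excerpt is made in practice. A hypothetical end-to-end usage, with placeholder data and hyperparameters, reusing the `random_forest_kernel` sketch above:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

# Fit the forest that defines the random partitions.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Precompute the Gram matrix and use it in kernel ridge regression.
K_train = random_forest_kernel(forest, X, X)
model = KernelRidge(alpha=1.0, kernel="precomputed").fit(K_train, y)

# Prediction needs the cross-kernel between new points and training points.
X_new = rng.normal(size=(10, 5))
predictions = model.predict(random_forest_kernel(forest, X_new, X))
```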