2016
DOI: 10.1007/978-3-319-46227-1_31

Subgroup Discovery with Proper Scoring Rules

Abstract: Subgroup Discovery is the process of finding and describing sufficiently large subsets of a given population that have unusual distributional characteristics with regard to some target attribute. Such subgroups can be used as a statistical summary which improves on the default summary of stating the overall distribution in the population. A natural way…

Cited by 6 publications (10 citation statements)
References 14 publications
“…For constructing measures of model exceptionality, different options have been discussed and applied in Exceptional Model Mining literature. It has been proposed to use direct difference measures for the model parameters or structure [8], to apply bootstrap sampling over those differences [8,19], to use information theoretic measures [8], or to utilize likelihood-based approaches [21]. However, the respective functions are designed to quantify the exceptionality of a subgroup's induced model in a single dataset.…”
Section: Structure Of Interestingness Measures For Exceptional Model
Citation type: mentioning
confidence: 99%
“…Often, exceptionality measures 𝑒𝑥 are parameter-based, i.e., they compute a distance function such as Manhattan distance or Euclidean distance on the parameter values of the two models. As another direct comparison measure, we propose the Crossed Likelihood similarity, which we designed based on model-based Subgroup Discovery [21]. It aggregates the average likelihood of points from either side being generated by the model from the other side:…”
Section: Interestingness Measures
Citation type: mentioning
confidence: 99%
“…The quality measure ω_tv can be extended to situations where the subgroup follows the same higher order Markov chain as the entire dataset, but it cannot be used in situations where the subgroup follows a different order model. Song et al (2015, 2016) propose what they call Model-Based Subgroup Discovery (MBSD) where the divergence between the target probability estimates and the true labels of an outcome variable is evaluated using Proper Scoring Rules (PSR) (Gneiting and Raftery 2007). We analyse sequential data without labels, but our evaluation measures are still related to those in Song et al (2016) since the information-theoretic scoring function AIC is derived from the Kullback-Leibler divergence (Burnham and Anderson 2004), which is associated with the logarithmic score as a PSR (Gneiting and Raftery 2007).…”
Section: Markov Chains
Citation type: mentioning
confidence: 99%
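
The idea in the statement above, scoring probability estimates against true labels with a proper scoring rule, can be illustrated with a minimal sketch. It is not the quality measure defined by Song et al; it only assumes a binary target, uses the Brier and logarithmic scores, and compares the default summary (one overall rate) with a summary that gives a candidate subgroup and its complement their own rates. The names `brier_score`, `log_score`, `summary_gain` and the toy data are hypothetical and chosen for illustration.

```python
import math

def brier_score(p, y):
    """Brier score for a binary label y in {0, 1} and forecast p (lower is better)."""
    return (p - y) ** 2

def log_score(p, y, eps=1e-12):
    """Logarithmic score: negative log-probability assigned to the observed label."""
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    return -math.log(p if y == 1 else 1 - p)

def summary_gain(score, labels, in_subgroup):
    """Drop in mean score when the subgroup and its complement each use their
    own empirical rate instead of the overall rate (hypothetical helper)."""
    n = len(labels)
    overall_rate = sum(labels) / n
    baseline = sum(score(overall_rate, y) for y in labels) / n

    inside = [y for y, m in zip(labels, in_subgroup) if m]
    outside = [y for y, m in zip(labels, in_subgroup) if not m]
    total = 0.0
    for part in (inside, outside):
        if part:  # skip an empty side
            rate = sum(part) / len(part)
            total += sum(score(rate, y) for y in part)
    return baseline - total / n

# Toy data (assumed): the first four instances form the candidate subgroup.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
in_subgroup = [True, True, True, True, False, False, False, False]

brier_gain = summary_gain(brier_score, labels, in_subgroup)
log_gain = summary_gain(log_score, labels, in_subgroup)
print(f"Brier gain: {brier_gain:.6f}, log-score gain: {log_gain:.6f}")
```

With both scores the gain is non-negative, because within each part the empirical rate minimises the mean score; a larger gain indicates a subgroup whose target distribution differs more from the overall one.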
“…Regarding data mining, we have also investigated approaches directly using the sensor data to detect potential behaviour changes of the households, without explicitly predicting variables like activities and locations. In [28] we proposed a method to find statistically abnormal subgroups by summarising different probabilistic models, and later in [27] we further demonstrated the proposed approach can be adopted to find abnormal spatio-temporal patterns from long-term sensor data within a home.…”
Section: Unsupervised Approaches
Citation type: mentioning
confidence: 99%