2011
DOI: 10.1088/0004-637x/744/2/192

Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

Abstract: Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophi…

Cited by 69 publications (69 citation statements)
References 46 publications

“…Recent development of modern machine learning techniques and high performance computing architectures have made possible the efficient execution of automated probabilistic multi-class classification of very large datasets in reasonable time frames (Debosscher et al, 2007; Sarro et al, 2009a, 2009b; Debosscher et al, 2009; Richards et al, 2011, 2012; Blomme et al, 2011; Matijevič et al, 2012; Morgan et al, 2012; Long et al, 2012). An essential step in the development of such a framework is processing a large set of events of known class to train the classifier.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Richards et al 2011a, 2012; Masci et al 2014). Several improvements have been proposed, in areas such as parametrizing light curves with maximal information retention (Kügler, Gianniotis & Polsterer 2015), and adjusting for training set deficiencies (Richards et al 2011b). One method of unsupervised machine learning is a Kohonen Self-Organizing Map (SOM; Kohonen 1990) demonstrated by Brett, West & Wheatley (2004) in an astronomical context.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Cases like these include spam detection, diagnosis of patients based on images and morphological classification of galaxies [1]. Although it might be expensive to train experts to a degree one can trust their labels [2], systems such as Amazon Mechanical Turk [3] allow each sample unit to be classified by many (not necessarily perfect) experts by a reasonably small cost. These experts do not have to be people.…”
Section: Introduction
Citation type: mentioning
confidence: 99%