In this paper, we present the FATS (Feature Analysis for Time Series) library. FATS is a Python library that facilitates and standardizes feature extraction for time series data. We focus on one application, feature extraction for astronomical light curve data, although the library generalizes to other uses. We detail the methods and features implemented for light curve analysis and present examples of its usage.
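As a rough illustration of what feature extraction for a light curve involves, the sketch below computes a few simple variability descriptors from a list of magnitudes using only the Python standard library. It does not use the FATS API: the function name `extract_features`, the choice of features, and the toy magnitudes are all assumptions made for illustration.

```python
import statistics


def extract_features(magnitudes):
    """Return a dict of simple variability descriptors for one light curve.

    Hypothetical helper for illustration only; not part of the FATS API.
    """
    n = len(magnitudes)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(magnitudes)
    std = statistics.stdev(magnitudes)  # sample standard deviation
    # Half of the peak-to-peak range of the magnitudes.
    amplitude = (max(magnitudes) - min(magnitudes)) / 2.0
    # Fraction of points lying more than one standard deviation from the mean.
    beyond_1std = sum(abs(m - mean) > std for m in magnitudes) / n
    return {
        "mean": mean,
        "std": std,
        "amplitude": amplitude,
        "beyond_1std": beyond_1std,
    }


# Toy magnitudes, made up for this example.
light_curve = [15.0, 15.2, 14.8, 15.1, 14.9, 15.0]
features = extract_features(light_curve)
```

Returning a dictionary keyed by feature name keeps the output easy to assemble into one row per light curve, which is the shape a classifier typically expects.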
We present a new classification method for quasar identification in the EROS‐2 and MACHO data sets based on a boosted version of a random forest classifier. We use a set of variability features that includes the parameters of a continuous autoregressive model, and we show that these parameters are very important discriminators in the classification process. We create two training sets (one for EROS‐2 and one for MACHO) using known quasars found in the Large Magellanic Cloud (LMC). On both the EROS‐2 and MACHO training sets, our model achieves about 90 per cent precision and 86 per cent recall, improving on the accuracy of state‐of‐the‐art models in quasar detection. We apply the model to the complete EROS‐2 and MACHO LMC data sets, comprising 28 million objects, and find 1160 and 2551 candidates, respectively. To further validate our candidates, we cross‐matched the list with 663 previously known strong candidates, obtaining matches for 74 per cent in MACHO and 40 per cent in EROS‐2. The difference in matching level arises mainly because EROS‐2 is a slightly shallower survey, which translates to light curves with a significantly lower signal‐to‐noise ratio.
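For readers less familiar with the two metrics quoted above, the following minimal sketch shows how precision and recall are computed from true and predicted labels. The function name `precision_recall` and the toy labels are assumptions for illustration; they are not the paper's data or code.

```python
def precision_recall(true_labels, predicted_labels, positive=1):
    """Compute precision and recall for the class marked `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels, made up for this example: 1 = quasar, 0 = non-quasar.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Precision is the fraction of predicted quasars that are real quasars, while recall is the fraction of real quasars that the classifier recovers.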
The success of automatic classification of variable stars strongly depends on the lightcurve representation. Usually, lightcurves are represented as a vector of many statistical descriptors, called features, designed by astronomers. These descriptors commonly demand significant computational power to calculate, require substantial research effort to develop, and do not guarantee good performance on the final classification task. Today, lightcurve representation is not entirely automatic; algorithms that extract lightcurve features are designed by humans and must be manually tuned for every survey. The vast amounts of data that will be generated by future surveys such as LSST mean that astronomers must develop analysis pipelines that are both scalable and automated. Recently, substantial efforts have been made in the machine learning community to develop methods that replace expert-designed and manually tuned features with features that are automatically learned from data. In this work we present what is, to our knowledge, the first unsupervised feature learning algorithm designed for variable objects. Our method works by extracting a large number of lightcurve subsequences from a given set of photometric data, which are then clustered to find common local patterns in the time series. Representatives of these common patterns, called exemplars, are then used to transform lightcurves of a labeled set into a new representation that can be used to train an automatic classifier. The proposed algorithm learns the features from both labeled and unlabeled lightcurves, overcoming the bias generated when the learning process is done only with labeled lightcurves. We test our method on MACHO and OGLE datasets; the results show that the classification performance we achieve is as good as, and in some cases better than, the performance achieved using traditional statistical features, while the computational cost is significantly lower. With these promising results, we believe that our method constitutes a significant step towards the automation of the lightcurve classification pipeline.
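The pipeline described in this abstract (extract subsequences, cluster them, and re-encode each lightcurve in terms of the resulting exemplars) can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not the authors' implementation: plain k-means with Euclidean distance stands in for whatever clustering method and distance the paper uses, cluster centres stand in for exemplars, each window has its mean subtracted, the encoding is a histogram of nearest-exemplar assignments, and all function names and toy data are hypothetical.

```python
def subsequences(series, width, step=1):
    """Slide a fixed-width window over a series.

    Each window has its mean subtracted so that shape, not brightness
    level, drives the clustering (an assumption of this sketch).
    """
    windows = []
    for i in range(0, len(series) - width + 1, step):
        w = series[i:i + width]
        mean = sum(w) / width
        windows.append([x - mean for x in w])
    return windows


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the centre closest to `point`."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def kmeans(points, k, iterations=20):
    """Plain k-means, initialised deterministically from evenly spaced points."""
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old centre if a cluster empties
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def encode(series, exemplars, width):
    """Represent a series as a normalised histogram of nearest exemplars."""
    counts = [0] * len(exemplars)
    windows = subsequences(series, width)
    for w in windows:
        counts[nearest(w, exemplars)] += 1
    return [c / len(windows) for c in counts]


# Toy "lightcurves", made up for this example: one constant, one rising.
flat = [0.0] * 8
ramp = [float(i) for i in range(8)]
width = 3

# Learn exemplars from the pooled subsequences; no labels are needed here.
pool = subsequences(flat, width) + subsequences(ramp, width)
exemplars = kmeans(pool, k=2)

# Re-encode each series in terms of the learned exemplars.
flat_code = encode(flat, exemplars, width)
ramp_code = encode(ramp, exemplars, width)
```

In this toy case the two series map to different histograms, which is the property a downstream classifier would rely on. The exemplar-learning step uses no labels, so unlabeled lightcurves can contribute to it.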