We present a case study of solar flare forecasting by means of metadata feature time series, by treating it as a prominent class-imbalance and temporally coherent problem. Taking full advantage of pre-flare time series in solar active regions is made possible via the Space Weather Analytics for Solar Flares (SWAN-SF) benchmark data set, a partitioned collection of multivariate time series of active region properties comprising 4075 regions and spanning over 9 yr of the Solar Dynamics Observatory period of operations. We showcase the general concept of temporal coherence triggered by the demand of continuity in time series forecasting and show that lack of proper understanding of this effect may spuriously enhance models’ performance. We further address another well-known challenge in rare-event prediction, namely, the class-imbalance issue. The SWAN-SF is an appropriate data set for this, with a 60:1 imbalance ratio for GOES M- and X-class flares and an 800:1 imbalance ratio for X-class flares against flare-quiet instances. We revisit the main remedies for these challenges and present several experiments to illustrate the exact impact that each of these remedies may have on performance. Moreover, we acknowledge that some basic data manipulation tasks such as data normalization and cross validation may also impact the performance; we discuss these problems as well. In this framework we also review the primary advantages and disadvantages of using true skill statistic and Heidke skill score, two widely used performance verification metrics for the flare-forecasting task. In conclusion, we show and advocate for the benefits of time series versus point-in-time forecasting, provided that the above challenges are measurably and quantitatively addressed.
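As a rough illustration of the two verification metrics named in this abstract, the sketch below computes the true skill statistic (TSS) and the Heidke skill score (HSS) from a 2x2 contingency table using their standard definitions. The function names and the counts are hypothetical, chosen only to mimic a roughly 60:1 imbalance; they are not results from the paper.

```python
def tss(tp, fn, fp, tn):
    """True skill statistic: hit rate minus false-alarm rate."""
    return tp / (tp + fn) - fp / (fp + tn)


def hss(tp, fn, fp, tn):
    """Heidke skill score: improvement over a random-chance forecast."""
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


# Hypothetical counts: same hit rate (0.8) and false-alarm rate (0.1),
# once with 100 flaring vs 6000 quiet instances, once with 100 vs 100.
imbalanced = dict(tp=80, fn=20, fp=600, tn=5400)
balanced = dict(tp=80, fn=20, fp=10, tn=90)

for name, counts in (("imbalanced", imbalanced), ("balanced", balanced)):
    print(f"{name}: TSS={tss(**counts):.3f}  HSS={hss(**counts):.3f}")
```

With these made-up counts, TSS is the same in both cases because it depends only on the two rates, while HSS drops sharply on the imbalanced table. This is one commonly cited difference between the two scores, and it is offered here only as context for the comparison the abstract mentions.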
In Fall 2008 NASA selected a large international consortium to produce a comprehensive automated feature-recognition system for the Solar Dynamics Observatory (SDO). The SDO data that we consider are all of the Atmospheric Imaging Assembly (AIA) images plus surface magnetic-field images from the Helioseismic and Magnetic Imager (HMI). We produce robust, very efficient, professionally coded software modules that can keep up with the SDO data stream and detect, trace, and analyze numerous phenomena, including flares, sigmoids, filaments, coronal dimmings, polarity inversion lines, sunspots, X-ray bright points, active regions, coronal holes, EIT waves, coronal mass ejections (CMEs), coronal oscillations, and jets. We also track the emergence and evolution of magnetic elements down to the smallest detectable features and will provide at least four full-disk, nonlinear, force-free magnetic field extrapolations per day. The detection of CMEs and filaments is accomplished with Solar and Heliospheric Observatory (SOHO)/Large Angle and Spectrometric Coronagraph (LASCO) and ground-based Hα data, respectively. A completely new software element is a trainable feature-detection module based on a generalized image-classification algorithm. Such a trainable module can be used to find features that have not yet been discovered (as, for example, sigmoids were in the pre-Yohkoh era). Our codes will produce entries in the Heliophysics Events Knowledgebase (HEK) as well as produce complete catalogs for results that are too numerous for inclusion in the HEK, such as the X-ray bright-point metadata. This will permit users to locate data on individual events as well as carry out statistical studies on large numbers of events, using the interface provided by the Virtual Solar Observatory.
The operations concept for our computer vision system is that the data will be analyzed in near real time as soon as they arrive at the SDO Joint Science Operations Center and have undergone basic processing. This will allow the system to produce timely space-weather alerts and to guide the selection and production of quicklook images and movies, in addition to its prime mission of enabling solar science. We briefly describe the complex and unique data-processing pipeline, consisting of the hardware and control software required to handle the SDO data stream and accommodate the computer-vision modules, which has been set up at the Lockheed-Martin Space Astrophysics Laboratory (LMSAL), with an identical copy at the Smithsonian Astrophysical Observatory (SAO).
We introduce and make openly accessible a comprehensive, multivariate time series (MVTS) dataset extracted from solar photospheric vector magnetograms in Spaceweather HMI Active Region Patch (SHARP) series. Our dataset also includes a cross-checked NOAA solar flare catalog that immediately facilitates solar flare prediction efforts. We discuss methods used for data collection, cleaning and pre-processing of the solar active region and flare data, and we further describe a novel data integration and sampling methodology. Our dataset covers 4,098 MVTS data collections from active regions occurring between May 2010 and December 2018, includes 51 flare-predictive parameters, and integrates over 10,000 flare reports. Potential directions toward expansion of the time series, either “horizontally” – by adding more prediction-specific parameters, or “vertically” – by generalizing flare into integrated solar eruption prediction, are also explained. The immediate tasks enabled by the disseminated dataset include: optimization of solar flare prediction and detailed investigation for elusive flare predictors or precursors, with both operational (research-to-operations), and basic research (operations-to-research) benefits potentially following in the future.
Spatio-temporal co-occurring patterns represent subsets of event types that occur together in both space and time. In comparison to previous work in this field, we present a general framework to identify spatio-temporal co-occurring patterns for continuously evolving spatio-temporal events that have polygon-like representations. We also propose a set of measures to identify spatio-temporal co-occurring patterns and propose an Apriori-based spatio-temporal co-occurrence mining algorithm to find prevalent spatio-temporal co-occurring patterns for extended spatial representations that evolve over time. We evaluate our framework on real-life data to demonstrate the effectiveness of our measures and the algorithm. We present results highlighting the importance of our measures in identifying spatio-temporal co-occurrence patterns.
In analyses of rare events, regardless of the domain of application, the class-imbalance issue is intrinsic. Although the challenges are known to data experts, their explicit impact on the analysis and on the decisions made based on the findings is often overlooked. This is particularly prevalent in interdisciplinary research, where the theoretical aspects are sometimes overshadowed by the challenges of the application. To showcase these undesirable impacts, we conduct a series of experiments on a recently created benchmark dataset named Space Weather ANalytics for Solar Flares (SWAN-SF). This is a multivariate time series dataset of magnetic parameters of active regions. As a remedy for the imbalance issue, we study the impact of data manipulation (undersampling and oversampling) and model manipulation (using class weights). Furthermore, we bring into focus the autocorrelation of time series that is inherited from the use of a sliding window for monitoring flares' history. Temporal coherence, as we call this phenomenon, invalidates the randomness assumption, thus impacting all sampling practices, including different cross-validation techniques. We illustrate how failing to notice this concept could give an artificial boost to the forecast performance and result in misleading findings. Throughout this study we use a Support Vector Machine as the classifier and the True Skill Statistic as the verification metric for comparing experiments. We conclude our work by specifying the correct practice in each case, and we hope that this study could benefit researchers in other domains where time series of rare events are of interest.
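The temporal-coherence point in this abstract can be sketched in a few lines. In the toy example below, each sliding-window slice carries the identifier of the active region it was cut from, and the train/test split is made per region so that overlapping, autocorrelated slices never land on both sides. The field names (`region`, `label`), the helper names, and the toy data are assumptions made for illustration; this is not the authors' procedure, and SWAN-SF itself ships with its own partitions. The class weights use the common "balanced" heuristic, n_samples / (n_classes * class_count), which is also how scikit-learn defines `class_weight='balanced'`.

```python
import random
from collections import Counter


def split_by_region(samples, test_fraction=0.3, seed=0):
    """Split so that all slices of a given region end up on one side only."""
    regions = sorted({s["region"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(regions)
    n_test = max(1, round(len(regions) * test_fraction))
    test_regions = set(regions[:n_test])
    train = [s for s in samples if s["region"] not in test_regions]
    test = [s for s in samples if s["region"] in test_regions]
    return train, test


def balanced_class_weights(samples):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(s["label"] for s in samples)
    n_samples, n_classes = len(samples), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


# Toy data: 10 regions with 5 overlapping slices each; regions 0 and 1 flare.
samples = [
    {"region": r, "slice": i, "label": 1 if r < 2 else 0}
    for r in range(10)
    for i in range(5)
]

train, test = split_by_region(samples)
train_regions = {s["region"] for s in train}
test_regions = {s["region"] for s in test}
print("shared regions:", train_regions & test_regions)
print("class weights on the training side:", balanced_class_weights(train))
```

A plain random shuffle of the individual slices would almost certainly place slices of the same region in both sets, which is the kind of leakage the abstract warns can inflate scores. Any resampling or weighting would then be applied to the training side only.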
The National Aeronautics and Space Administration (NASA) Solar Dynamics Observatory (SDO) mission has given us unprecedented insight into the Sun’s activity. By capturing approximately 70,000 images a day, this mission has created one of the richest and biggest repositories of solar image data available to mankind. With such massive amounts of information, researchers have been able to produce great advances in detecting solar events. In this resource, we compile SDO solar data into a single repository in order to provide the computer vision community with a standardized and curated large-scale dataset of several hundred thousand solar events found on high-resolution solar images. This publicly available resource, along with the generation source code, will accelerate computer vision research on NASA’s solar image data by reducing the amount of time spent performing data acquisition and curation from the multiple sources we have compiled. By improving the quality of the data with thorough curation, we anticipate wider adoption and interest across both the computer vision and solar physics communities.