Despite the progress within the last decades, weather forecasting is still a challenging and computationally expensive task. Current satellite-based approaches to predict thunderstorms are usually based on the analysis of the observed brightness temperatures in different spectral channels and emit a warning if a critical threshold is reached. Recent progress in data science however demonstrates that machine learning can be successfully applied to many research fields in science, especially in areas dealing with large datasets. We therefore present a new approach to the problem of predicting thunderstorms based on machine learning. The core idea of our work is to use the error of two-dimensional optical flow algorithms applied to images of meteorological satellites as a feature for machine learning models. We interpret that optical flow error as an indication of convection potentially leading to thunderstorms and lightning. To factor in spatial proximity we use various manual convolution steps. We also consider effects such as the time of day or the geographic location. We train different tree classifier models as well as a neural network to predict lightning within the next few hours (called nowcasting in meteorology) based on these features. In our evaluation section we compare the predictive power of the different models and the impact of different features on the classification result. Our results show a high accuracy of 96% for predictions over the next 15 minutes which slightly decreases with increasing forecast period but still remains above 83% for forecasts of up to five hours. The high false positive rate of nearly 6% however needs further investigation to allow for an operational use of our approach. CCS CONCEPTS • Computing methodologies → Classification and regression trees; Neural networks; • Applied computing → Earth and atmospheric sciences.
Index structures are a building block of query processing and computer science in general. Since the dawn of computer technology there have been index structures. And since then, a myriad of index structures are being invented and published each and every year. In this paper we argue that the very idea of "inventing an index" is a misleading concept in the first place. It is the analogue of "inventing a physical query plan". This paper is a paradigm shift in which we propose to drop the idea to handcraft index structures (as done for binary search trees over B-trees to any form of learned index) altogether. We present a new automatic index breeding framework coined Genetic Generic Generation of Index Structures (GENE) . It is based on the observation that almost all index structures are assembled along three principal dimensions: (1) structural building blocks, e.g., a B-tree is assembled from two different structural node types (inner and leaf nodes), (2) a couple of invariants, e.g., for a B-tree all paths have the same length, and (3) decisions on the internal layout of nodes (row or column layout, etc.). We propose a generic indexing framework that can mimic many existing index structures along those dimensions. Based on that framework we propose a generic genetic index generation algorithm that, given a workload and an optimization goal, can automatically assemble and mutate, in other words 'breed' new index structure 'species'. In our experiments we follow multiple goals. We reexamine some good old wisdom from database technology. Given a specific workload, will GENE even breed an index that is equivalent to what our textbooks and papers currently recommend for such a workload? Or can we do even more? Our initial results strongly indicate that generated indexes are the next step in designing index structures.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.