We present a new interactive data exploration approach, called Semantic Windows (SW), in which users query for multidimensional "windows" of interest via standard DBMSstyle queries enhanced with exploration constructs. Users can specify SWs using (i) shape-based properties, e.g., "identify all 3-by-3 windows", as well as (ii) content-based properties, e.g., "identify all windows in which the average brightness of stars exceeds 0.8". This SW approach enables the interactive processing of a host of useful exploratory queries that are difficult to express and optimize using standard DBMS techniques.SW uses a sampling-guided, data-driven search strategy to explore the underlying data set and quickly identify windows of interest. To facilitate human-in-the-loop style interactive processing, SW is optimized to produce online results during query execution. To control the tension between online performance and query completion time, it uses a tunable, adaptive prefetching technique. To enable exploration of big data, the framework supports distributed computation.We describe the semantics and implementation of SW as a distributed layer on top of PostgreSQL. The experimental results with real astronomical and artificial data reveal that SW can offer online results quickly and continuously with little or no degradation in query completion times.
Searchlight [9] enables search and exploration of large, multidimensional data sets interactively. It allows users to explore by specifying rich constraints for the "objects" they are interested in identifying. Constraints can express a variety of properties, including a shape of the object (e.g., a waveform interval of length 10-100ms), its aggregate properties (e.g., the average amplitude of the signal over the interval is greater than 10), and similarity to another object (e.g., the distance between the interval's waveform and the query waveform is less than 5). Searchlight allows users to specify an arbitrary number of such constraints, with mixing different types of constraints in the same query.Searchlight enhances the query execution engine of an array DBMS (currently SciDB) with the ability to perform sophisticated search using the power of Constraint Programming (CP). This allows an existing CP solver from Or-Tools (an open-source suite of operations research tools from Google) to directly access data inside the DBMS without the need to extract and transform it.This demo will illustrate the rich search and exploration capabilities of Searchlight, and its innovative technical features, by using the real-world MIMIC II data set, which contains waveform data for multi-parameter recordings of ICU patients, such as ABP (Arterial Blood Pressure) and ECG (electrocardiogram). Users will be able to search for interesting waveform intervals by specifying aggregate properties of the corresponding signals. In addition, they will be able to search for intervals similar to already found, where similarity is defined as a distance between the signal sequences.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.