Kexin Rong scite author profile

Kexin Rong

5Publications

32Citation Statements Received

70Citation Statements Given

How they've been cited

158

How they cite others

289

Affiliations

Georgia Institute of Technology, Stanford University, Zhejiang Normal University

Publications

Order By: Most citations

MacroBase

Abuzaid¹,

Bailis²,

Ding³

et al. 2017

View full text Add to dashboard Cite

As data volumes continue to rise, manual inspection is becoming increasingly untenable. In response, we present MacroBase, a data analytics engine that prioritizes end-user attention in high-volume fast data streams. MacroBase enables efficient, accurate, and modular analyses that highlight and aggregate important and unusual behavior, acting as a search engine for fast data. MacroBase is able to deliver order-of-magnitude speedups over alternatives by optimizing the combination of explanation and classification tasks and by leveraging a new reservoir sampler and heavy-hitters sketch specialized for fast data streams. As a result, MacroBase delivers accurate results at speeds of up to 2M events per second per query on a single core. The system has delivered meaningful results in production, including at a telematics company monitoring hundreds of thousands of vehicles.

show abstract

Locality-sensitive hashing for earthquake detection

et al. 2018

View full text Add to dashboard Cite

In this work, we report on a novel application of Locality Sensitive Hashing (LSH) to seismic data at scale. Based on the high waveform similarity between reoccurring earthquakes, our application identifies potential earthquakes by searching for similar time series segments via LSH. However, a straightforward implementation of this LSH-enabled application has difficulty scaling beyond 3 months of continuous time series data measured at a single seismic station. As a case study of a data-driven science workflow, we illustrate how domain knowledge can be incorporated into the workload to improve both the efficiency and result quality. We describe several end-to-end optimizations of the analysis pipeline from pre-processing to post-processing, which allow the application to scale to time series data measured at multiple seismic stations. Our optimizations enable an over 100× speedup in the end-to-end analysis pipeline. This improved scalability enabled seismologists to perform seismic analysis on more than ten years of continuous time series data from over ten seismic stations, and has directly enabled the discovery of 597 new earthquakes near the Diablo Canyon nuclear power plant in California and 6123 new earthquakes in New Zealand.

show abstract

Asap

Rong¹,

Bailis²

2017

Proc. VLDB Endow.

View full text Add to dashboard Cite

Time series visualization of streaming telemetry (i.e., charting of key metrics such as server load over time) is increasingly prevalent in modern data platforms and applications. However, many existing systems simply plot the raw data streams as they arrive, often obscuring large-scale trends due to small-scale noise. We propose an alternative: to better prioritize end users' attention, smooth time series visualizations as much as possible to remove noise, while retaining large-scale structure to highlight significant deviations. We develop a new analytics operator called ASAP that automatically smooths streaming time series by adaptively optimizing the trade-off between noise reduction (i.e., variance) and trend retention (i.e., kurtosis). We introduce metrics to quantitatively assess the quality of smoothed plots and provide an efficient search strategy for optimizing these metrics that combines techniques from stream processing, user interface design, and signal processing via autocorrelation-based pruning, pixel-aware preaggregation, and ondemand refresh. We demonstrate that ASAP can improve users' accuracy in identifying long-term deviations in time series by up to 38.4% while reducing response times by up to 44.3%. Moreover, ASAP delivers these results several orders of magnitude faster than alternative search strategies.

show abstract

Cyberbullying experience and bystander behavior in cyberbullying incidents: The serial mediating roles of perceived incident severity and empathy

Zhao

Chu

Rong

2023

Computers in Human Behavior

View full text Add to dashboard Cite

Unsupervised Large‐Scale Search for Similar Earthquake Signals

Yoon

Bergen

Rong

et al. 2019

View full text Add to dashboard Cite

Seismology has continuously recorded ground‐motion spanning up to decades. Blind, uninformed search for similar‐signal waveforms within this continuous data can detect small earthquakes missing from earthquake catalogs, yet doing so with naive approaches is computationally infeasible. We present results from an improved version of the Fingerprint And Similarity Thresholding (FAST) algorithm, an unsupervised data‐mining approach to earthquake detection, now available as open‐source software. We use FAST to search for small earthquakes in 6–11 yr of continuous data from 27 channels over an 11‐station local seismic network near the Diablo Canyon nuclear power plant in central California. FAST detected 4554 earthquakes in this data set, with a 7.5% false detection rate: 4134 of the detected events were previously cataloged earthquakes located across California, and 420 were new local earthquake detections with magnitudes −0.3≤ML≤2.4, of which 224 events were located near the seismic network. Although seismicity rates are low, this study confirms that nearby faults are active. This example shows how seismology can leverage recent advances in data‐mining algorithms, along with improved computing power, to extract useful additional earthquake information from long‐duration continuous data sets.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Kexin Rong

MacroBase

Locality-sensitive hashing for earthquake detection

Asap

Cyberbullying experience and bystander behavior in cyberbullying incidents: The serial mediating roles of perceived incident severity and empathy

Unsupervised Large‐Scale Search for Similar Earthquake Signals

Contact Info

Product

Resources

About