Jun Yang scite author profile

Our news is saturated with claims of "facts" made from data. Database research has in the past focused on how to answer queries, but has not devoted much attention to discerning more subtle qualities of the resulting claims, e.g., is a claim "cherry-picking"? This paper proposes a framework that models claims based on structured data as parameterized queries. A key insight is that we can learn a lot about a claim by perturbing its parameters and seeing how its conclusion changes. This framework lets us formulate practical fact-checking tasks, such as reverse-engineering (often intentionally) vague claims and countering questionable claims, as computational problems. Along with the modeling framework, we develop an algorithmic framework that enables efficient instantiations of "meta" algorithms by supplying appropriate algorithmic building blocks. We present real-world examples and experiments that demonstrate the power of our model, the efficiency of our algorithms, and the usefulness of their results.
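The idea of perturbing a claim's parameters can be illustrated with a minimal sketch. It is not the paper's framework or algorithms: the claim form (a window-over-window comparison), the function names `claim` and `perturb`, and the yearly counts are all hypothetical, chosen only to show how neighboring parameter settings can put a claim in context.

```python
def claim(series, end, width):
    """Relative change of the window ending at `end` versus the
    preceding window of the same width (hypothetical claim form)."""
    recent = sum(series[end - width:end])
    prior = sum(series[end - 2 * width:end - width])
    return recent / prior - 1.0


def perturb(series, end, width, max_shift=2):
    """Evaluate the same claim at nearby (end, width) settings."""
    results = {}
    for e in range(end - max_shift, end + max_shift + 1):
        for w in range(max(1, width - 1), width + 2):
            # Keep only settings whose two windows fit inside the data.
            if e - 2 * w >= 0 and e <= len(series):
                results[(e, w)] = claim(series, e, w)
    return results


# Hypothetical yearly counts; the last year is an outlier.
counts = [10, 12, 11, 13, 12, 14, 13, 20]

# Original claim: "the latest year is up sharply over the previous year".
original = claim(counts, end=8, width=1)
neighbors = perturb(counts, end=8, width=1)

# If the original setting is the most favorable of all nearby settings,
# that is a hint (not proof) that the parameters were cherry-picked.
most_favorable = max(neighbors, key=neighbors.get)
```

Comparing `original` against the values in `neighbors` shows how much the conclusion depends on the chosen window; a brute-force enumeration like this is only practical for tiny parameter spaces.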


Incremental computation and maintenance of temporal aggregates

Yang

Widom

2003

The VLDB Journal The International Journal on Very Large Data B

100


Task Allocation for Wireless Sensor Network Using Modified Binary Particle Swarm Optimization

et al. 2014


I/O-efficient statistical computing with RIOT

Zhang

Yang

2010


R is a numerical computing environment that is widely popular for statistical data analysis. Like many such environments, R performs poorly for large datasets whose sizes exceed that of physical memory. We present our vision of RIOT (R with I/O Transparency), a system that makes R programs I/O-efficient in a way transparent to the users. We describe our experience with RIOT-DB, an initial prototype that uses a relational database system as a backend. Despite the overhead and inadequacy of generic database systems in handling array data and numerical computation, RIOT-DB significantly outperforms R in many large-data scenarios, thanks to a suite of high-level, inter-operation optimizations that integrate seamlessly into R. While many techniques in RIOT are inspired by databases (and, for RIOT-DB, realized by a database system), RIOT users are insulated from anything database related. Compared with previous approaches that require users to learn new languages and rewrite their programs to interface with a database, RIOT will, we believe, be easier to adopt by the majority of the R users.


New order preserving encryption model for outsourced databases in cloud environments

Liu

Chen

Yang

et al. 2016

Journal of Network and Computer Applications


Processing a large number of continuous preference top-k queries

Agarwal

Yang

2012


Given a set of objects, each with multiple numeric attributes, a (preference) top-k query retrieves the k objects with the highest scores according to a user preference, defined as a linear combination of attribute values. We consider the problem of processing a large number of continuous top-k queries, each with its own preference. When objects or user preferences change, the query results must be updated. We present a dynamic index that supports the reverse top-k query, which is of independent interest. Combining this index with another one for top-k queries, we develop a scalable solution for processing many continuous top-k queries that exploits the clusteredness in user preferences. We also define an approximate version of the problem and present a solution significantly more efficient than the exact one with little loss in accuracy.

... according to his or her preference, i.e., those with the highest results for the linear combination. A user who cares most about the size of the living area may assign the largest weight to this attribute (assuming that values of different attributes have been appropriately normalized relative to each other). On the other hand, a user who enjoys a yard more than indoor space may give the lot size a larger weight than the size of the living area. Because of the wide range of applications, there has been a lot of work on preference top-k queries [15,17,18,19,27].
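As a rough illustration of the query definition only, a single preference top-k query can be answered by scoring every object and keeping the k best. This brute-force sketch is not the dynamic index or the continuous-query solution described in the abstract; the function name `top_k`, the house data, and the weights are hypothetical.

```python
import heapq


def top_k(objects, weights, k):
    """Return the names of the k objects with the highest linear score.

    `objects` maps a name to a tuple of normalized attribute values;
    `weights` is the user's preference vector of the same length.
    """
    def score(attrs):
        return sum(w * a for w, a in zip(weights, attrs))

    best = heapq.nlargest(k, objects.items(), key=lambda item: score(item[1]))
    return [name for name, _ in best]


# Hypothetical normalized attributes: (living_area, lot_size).
houses = {
    "A": (0.9, 0.2),
    "B": (0.5, 0.8),
    "C": (0.3, 0.9),
    "D": (0.7, 0.6),
}

# A user who weights living area most heavily.
indoor_pick = top_k(houses, weights=(0.8, 0.2), k=2)

# A user who prefers a yard to indoor space.
yard_pick = top_k(houses, weights=(0.2, 0.8), k=2)
```

Rescanning all objects like this costs time linear in the number of objects per query, which is the cost a dedicated index aims to avoid when many queries must stay up to date as objects and preferences change.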



Jun Yang

A Sampling-Based Approach to Optimizing Top-k Queries in Sensor Networks

Sparse random ultrasound phased array for focal surgery

Toward computational fact-checking
