Xiating Ouyang scite author profile

Xiating Ouyang

5 Publications

8 Citation Statements Received

88 Citation Statements Given

Affiliations

University of Wisconsin–Madison, Hong Kong Polytechnic University

Publications

Consistent Query Answering for Primary Keys on Path Queries

Koutris

Ouyang

Wijsen

2021

We study the data complexity of consistent query answering (CQA) on databases that may violate the primary key constraints. A repair is a maximal subset of the database satisfying the primary key constraints. For a Boolean query q, the problem CERTAINTY(q) takes a database as input, and asks whether or not each repair satisfies q. The computational complexity of CERTAINTY(q) has been established whenever q is a self-join-free Boolean conjunctive query, or a (not necessarily self-join-free) Boolean path query. In this paper, we take one more step towards a general classification for all Boolean conjunctive queries by considering the class of rooted tree queries. In particular, we show that for every rooted tree query q, CERTAINTY(q) is in FO, NL-hard ∩ LFP, or coNP-complete, and it is decidable (in polynomial time), given q, which of the three cases applies. We also extend our classification to larger classes of queries with simple primary keys. Our classification criteria rely on query homomorphisms and our polynomial-time fixpoint algorithm is based on a novel use of context-free grammar (CFG).
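The definitions of a repair and of CERTAINTY(q) in the abstract above can be illustrated with a brute-force sketch. This is only an exponential-time illustration of the definitions, not the paper's classification or fixpoint algorithm. The fact encoding, the function names (`repairs`, `certainty`, `path_rs`), the example database, and the assumption that the first attribute of every relation is its primary key are all hypothetical choices made for this example.

```python
from collections import defaultdict
from itertools import product


def repairs(facts, key_len=1):
    """Yield every repair: exactly one fact from each primary-key block.

    Assumption for this sketch: a fact is (relation_name, values) and the
    first `key_len` values form the primary key of every relation.
    """
    blocks = defaultdict(list)
    for relation, values in facts:
        blocks[(relation, values[:key_len])].append((relation, values))
    for choice in product(*blocks.values()):
        yield set(choice)


def certainty(facts, query, key_len=1):
    """Return True iff the Boolean query holds in every repair."""
    return all(query(repair) for repair in repairs(facts, key_len))


def path_rs(db):
    """Boolean path query: there exist x, y, z with R(x, y) and S(y, z)."""
    return any(
        rel1 == "R" and rel2 == "S" and v1[1] == v2[0]
        for rel1, v1 in db
        for rel2, v2 in db
    )


# Hypothetical inconsistent database: two R-facts share the key "a".
db = {
    ("R", ("a", "b")),
    ("R", ("a", "c")),
    ("S", ("b", "d")),
}
# Adding S(c, d) makes the query true whichever R-fact a repair keeps.
db_fixed = db | {("S", ("c", "d"))}

print(certainty(db, path_rs))        # False: the repair keeping R(a, c) fails
print(certainty(db_fixed, path_rs))  # True: both repairs satisfy the query
```

Because the number of repairs grows exponentially with the number of conflicting blocks, enumeration like this is only practical on toy inputs, which is why the complexity classification of CERTAINTY(q) matters.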

Unit interval vertex deletion: Fewer vertices are relevant

Cao

Ouyang

et al. 2018

Journal of Computer and System Sciences

The unit interval vertex deletion problem asks for a set of at most k vertices whose deletion from an n-vertex graph makes it a unit interval graph. We develop an O(k^4)-vertex kernel for the problem, significantly improving the O(k^53)-vertex kernel of Fomin, Saurabh, and Villanger [ESA'12; SIAM J. Discrete Math 27 (2013)]. We introduce a novel way of organizing cliques of a unit interval graph. Our constructive proof for the correctness of our algorithm, using interval models, greatly simplifies the destructive proofs, based on forbidden induced subgraphs, for similar problems in the literature.

Unit Interval Vertex Deletion: Fewer Vertices are Relevant

Ke

Cao

Ouyang

et al. 2016

Preprint

LinCQA: Faster Consistent Query Answering with Linear Time Guarantees

Fan

Koutris

Ouyang

et al. 2023

Proc. ACM Manag. Data

Data analytical pipelines often encounter the problem of querying inconsistent data that violate pre-determined integrity constraints. Data cleaning is an extensively studied paradigm that singles out a consistent repair of the inconsistent data. Consistent query answering (CQA) is an alternative approach to data cleaning that asks for all tuples guaranteed to be returned by a given query on all (in most cases, exponentially many) repairs of the inconsistent data. In this paper, we identify a class of acyclic select-project-join (SPJ) queries for which CQA can be solved via SQL rewriting with a linear time guarantee. Our rewriting method can be viewed as a generalization of Yannakakis' algorithm for acyclic joins to the inconsistent setting. We present LinCQA, a system that takes as input any query in our class and outputs rewritings in both SQL and non-recursive Datalog with negation. We show that LinCQA often outperforms the existing CQA systems on both synthetic and real-world workloads, and in some cases, by orders of magnitude.
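To make the idea of answering CQA via SQL rewriting concrete, here is a minimal sketch for the simplest possible case: a selection over a single relation with a violated primary key. It is not LinCQA's rewriting and does not reproduce its output. The table `Emp`, its columns, its data, and the variable names are hypothetical and chosen only for illustration. The rewritten query returns a name only when every tuple in that name's key block satisfies the selection, so the answer holds in every repair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; `name` is the intended primary key but is not enforced,
# so the data below is inconsistent (two conflicting rows for "bob").
conn.execute("CREATE TABLE Emp (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?)",
    [("alice", "db"), ("bob", "db"), ("bob", "ml"), ("carol", "ml")],
)

# Original query, evaluated directly on the inconsistent data.
NAIVE = "SELECT DISTINCT name FROM Emp WHERE dept = 'db'"

# Rewriting for consistent answers: keep a name only if no tuple with the
# same key value violates the selection condition.
CONSISTENT = """
SELECT DISTINCT e.name
FROM Emp e
WHERE e.dept = 'db'
  AND NOT EXISTS (
      SELECT 1 FROM Emp f
      WHERE f.name = e.name AND f.dept <> 'db'
  )
"""

naive_answers = sorted(row[0] for row in conn.execute(NAIVE))
consistent_answers = sorted(row[0] for row in conn.execute(CONSISTENT))

print(naive_answers)       # ['alice', 'bob']
print(consistent_answers)  # ['alice']
```

In this toy instance "bob" is returned by the direct query but is not a consistent answer, because the repair that keeps the row ("bob", "ml") does not place him in the "db" department.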

SparkCruise

et al. 2021

Today, cloud companies offer fully managed Spark services. This has made it easy to onboard new customers but has also increased the volume of users and their workload sizes. However, both cloud providers and users lack the tools and time to optimize these massive workloads. To solve this problem, we designed SparkCruise, which can help understand and optimize workload instances by adding a workload-driven feedback loop to the Spark query optimizer. In this paper, we present our approach to collecting and representing Spark query workloads and use it to improve the overall performance on the workload, all without requiring any access to user data. These methods scale with the number of workloads and apply learned feedback in an online fashion. We explain one specific workload optimization developed for computation reuse. We also share the detailed analysis of production Spark workloads and contrast them with the corresponding analysis of the TPC-DS benchmark. To the best of our knowledge, this is the first study to share the analysis of large-scale production Spark SQL workloads.
