Zhengjie Miao scite author profile

Wang

et al. 2020

Online services are interested in solutions to opinion mining, which is the problem of extracting aspects, opinions, and sentiments from text. One method to mine opinions is to leverage the recent success of pre-trained language models, which can be fine-tuned to obtain high-quality extractions from reviews. However, fine-tuning language models still requires a non-trivial amount of training data. In this paper, we study the problem of how to significantly reduce the amount of labeled training data required in fine-tuning language models for opinion mining. We describe Snippext, an opinion mining system developed over a language model that is fine-tuned through semi-supervised learning with augmented data. A novelty of Snippext is its use of a two-pronged approach to achieve state-of-the-art (SOTA) performance with little labeled training data through: (1) data augmentation to automatically generate more labeled training data from existing ones, and (2) a semi-supervised learning technique to leverage the massive amount of unlabeled data in addition to the (limited amount of) labeled data. We show with extensive experiments that Snippext performs comparably to, and can even exceed, previous SOTA results on several opinion mining tasks with only half the training data required. Furthermore, it achieves new SOTA results when all training data are leveraged. Compared with a baseline pipeline, we found that Snippext extracts significantly more fine-grained opinions, which opens new opportunities for downstream applications.

Explaining Wrong Queries Using Small Examples

Roy

Yang

2019

For testing the correctness of SQL queries, e.g., evaluating student submissions in a database course, a standard practice is to execute the query in question on some test database instance and compare its result with that of the correct query. Given two queries Q1 and Q2, we say that a database instance D is a counterexample (for Q1 and Q2) if Q1(D) differs from Q2(D); such a counterexample can serve as an explanation of why Q1 and Q2 are not equivalent. While the test database instance may serve as a counterexample, it may be too large or complex to read and understand where the inequivalence comes from. Therefore, in this paper, given a known counterexample D for Q1 and Q2, we aim to find the smallest counterexample D′ ⊆ D where Q1(D′) ≠ Q2(D′). The problem in general is NP-hard. We give a suite of algorithms for finding the smallest counterexample for different classes of queries, some more tractable than others. We also present an efficient provenance-based algorithm for SPJUD queries that uses a constraint solver, and extend it to more complex queries with aggregation, group-by, and nested queries. We perform extensive experiments indicating the effectiveness and scalability of our solution on student queries from an undergraduate database course and on queries from the TPC-H benchmark. We also report a user study from the course where we deployed our tool to help students with an assignment on relational algebra.
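The definition of a smallest counterexample can be illustrated with a naive search. The sketch below is not the provenance-based algorithm from the paper; it is a brute-force enumeration, exponential in the size of D, that only makes the problem statement concrete. The relation names, the two queries, and the helper `smallest_counterexample` are all hypothetical.

```python
from itertools import combinations

def smallest_counterexample(db, q1, q2):
    """Return a minimum-size sub-instance of db on which q1 and q2 disagree, or None."""
    # Flatten the instance into (relation, row) facts so subsets can be enumerated.
    facts = sorted((rel, row) for rel, rows in db.items() for row in rows)
    # Try subsets in order of increasing size; the first disagreement is a smallest one.
    for size in range(len(facts) + 1):
        for subset in combinations(facts, size):
            sub = {rel: set() for rel in db}
            for rel, row in subset:
                sub[rel].add(row)
            if q1(sub) != q2(sub):
                return sub
    return None

# Hypothetical instance: Student(name), Enroll(name, course).
db = {
    "Student": {("ann",), ("bob",)},
    "Enroll": {("ann", "db"), ("ann", "os"), ("bob", "ml")},
}

def q1(d):
    # Reference query: students enrolled in the course "db".
    return {s for (s,) in d["Student"] for (n, c) in d["Enroll"] if n == s and c == "db"}

def q2(d):
    # Wrong query: forgets the course filter.
    return {s for (s,) in d["Student"] for (n, c) in d["Enroll"] if n == s}

result = smallest_counterexample(db, q1, q2)
print(result)
```

On this instance the search returns a two-tuple sub-instance (one student plus one enrollment in a course other than "db"), which is much easier to read than the full database.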

Going Beyond Provenance

Zeng

Glavic

et al. 2019

Cape

Zeng

Li

et al. 2019

Proc. VLDB Endow.

In this demonstration we showcase Cape, a system that explains surprising aggregation outcomes. In contrast to previous work, which relies exclusively on provenance, Cape explains outliers in aggregation queries through related outliers in the opposite direction that provide counterbalance. The foundation of our approach is aggregate regression patterns (ARPs), which describe coarse-grained trends in the data. We define outliers as deviations from such patterns and present an efficient algorithm to find counterbalances explaining outliers. In the demonstration, the audience can run aggregation queries over real-world datasets, identify outliers of interest in the result of such queries, and browse the patterns and explanations returned by Cape.
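The idea of a counterbalance can be sketched in a few lines. This is a heavily simplified illustration, not Cape's algorithm: it assumes the only pattern is a constant trend (the mean of each series), treats a deviation from that mean as an outlier, and ranks related series by how far they deviate in the opposite direction at the same point. The data and the function names `deviations` and `counterbalances` are made up for the example.

```python
from statistics import mean

def deviations(series):
    """Deviation of each point from a constant pattern (the series mean)."""
    m = mean(series.values())
    return {key: value - m for key, value in series.items()}

def counterbalances(target, key, related):
    """Rank related series that deviate in the opposite direction at `key`."""
    d = deviations(target)[key]
    found = []
    for name, series in related.items():
        if key not in series:
            continue
        r = deviations(series)[key]
        if d * r < 0:  # opposite signs: a candidate counterbalance
            found.append((name, r))
    # Strongest opposite deviation first.
    return sorted(found, key=lambda item: -abs(item[1]))

# Hypothetical aggregation result: publications per year for one author, by venue.
venue_a = {2005: 4, 2006: 4, 2007: 1, 2008: 3}   # unusually low in 2007
related = {
    "venue_b": {2005: 1, 2006: 1, 2007: 5, 2008: 1},  # unusually high in 2007
    "venue_c": {2005: 2, 2006: 2, 2007: 2, 2008: 2},  # flat
    "venue_d": {2005: 3, 2006: 3, 2007: 1, 2008: 1},  # also low in 2007
}

result = counterbalances(venue_a, 2007, related)
print(result)
```

Here only `venue_b` is reported, because it is the only related series whose 2007 value deviates upward while the target deviates downward.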

Polyaniline-Based Rose-like Chiral Nanostructures for Raman Enhancement

Zhou

Ren

et al. 2022

ACS Appl. Nano Mater.

Metal-free materials that efficiently enhance Raman spectra represent a frontier in the field of Raman enhancement. Herein, a metal-free Raman enhancement-active material, polyaniline (PANI)-based rose-like chiral nanostructures, is developed; it is fabricated by the oxidative polymerization of aniline in a reaction droplet on a glass slide. The morphology of the as-prepared PANI nanostructures is controlled to give rose-like chiral (left-handed or right-handed) nanostructures. Owing to their suitable energy levels and chiral nanostructure, the prepared PANI nanostructures exhibit excellent Raman enhancement performance compared with other metal-free materials. The enhancement factors of the chiral nanostructures are more than 10 times those of achiral nanostructures for dye molecules (e.g., 22.5 times for crystal violet), indicating the significance of the chiral nanostructure for macroscopic performance. Moreover, the superior reproducibility and generality of the PANI-based rose-like chiral nanostructures illustrate their high potential in Raman enhancement applications. This work not only provides an efficient metal-free platform for Raman enhancement but also offers an idea that is highly instructive for the rational design of advanced Raman enhancement-active materials for broader applications.