Taifeng Wang scite author profile

SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check

Chinese Spelling Check (CSC) is the task of detecting and correcting spelling errors in Chinese natural language. Existing methods have attempted to incorporate similarity knowledge between Chinese characters. However, they treat this knowledge either as an external input resource or as heuristic rules. This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN). The model builds a graph over the characters, and SpellGCN learns to map this graph into a set of inter-dependent character classifiers. These classifiers are applied to the representations extracted by another network, such as BERT, so that the whole network is end-to-end trainable. Experiments are conducted on three human-annotated datasets. Our method outperforms previous models by a large margin.
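The idea of mapping a character graph to inter-dependent classifiers can be illustrated with a single graph-convolution step. The following is a minimal sketch under stated assumptions, not the SpellGCN implementation: the three-character graph, the row normalization, the identity weights, and all function names are hypothetical choices made for illustration.

```python
def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Add self-loops, then row-normalize so every row sums to 1 (an assumed scheme)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[value / sum(row) for value in row] for row in with_loops]


def gcn_layer(adj_norm, node_feats, weight):
    """One graph convolution: average over similar characters, then apply a linear map."""
    return matmul(matmul(adj_norm, node_feats), weight)


def classify(token_repr, char_classifiers):
    """Score a token representation against each character's classifier vector."""
    return [sum(t * c for t, c in zip(token_repr, clf)) for clf in char_classifiers]


# Hypothetical similarity graph: characters 0 and 1 are confusable, character 2 is isolated.
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Character embeddings and layer weights are identity matrices to keep the arithmetic visible.
classifiers = gcn_layer(normalize_adjacency(adjacency), identity, identity)

# Stand-in for a representation produced by an encoder such as BERT.
scores = classify([1.0, 0.0, 0.0], classifiers)
```

Because characters 0 and 1 are neighbours in the graph, their classifier vectors come out identical here, so evidence for one also raises the score of the other, while the isolated character is unaffected.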


Question Directed Graph Attention Network for Numerical Reasoning over Text

Chen, Xu, Cheng et al. 2020


Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical reasoning over this context graph. Our model, which combines deep learning and graph reasoning, achieves remarkable results on benchmark datasets such as DROP (https://leaderboard.allenai.org/drop/submissions/public). As of September 08, 2020, our models are ranked first in the case of fair comparison using the identical pre-training model.


Large-Scale Low-Rank Matrix Learning with Nonconvex Regularizers

Yao, Kwok, Wang et al. 2019

IEEE Trans. Pattern Anal. Mach. Intell.


Low-rank modeling has many important applications in computer vision and machine learning. While the matrix rank is often approximated by the convex nuclear norm, the use of nonconvex low-rank regularizers has demonstrated better empirical performance. However, the resulting optimization problem is much more challenging. The recent state of the art requires an expensive full SVD in each iteration. In this paper, we show that for many commonly-used nonconvex low-rank regularizers, the singular values obtained from the proximal operator can be automatically thresholded. This allows the proximal operator to be efficiently approximated by the power method. We then develop a fast proximal algorithm and its accelerated variant with an inexact proximal step. It can be guaranteed that the squared distance between consecutive iterates converges at a rate of O(1/T), where T is the number of iterations. Furthermore, we show that the proposed algorithm can be parallelized, and the resultant algorithm achieves nearly linear speedup w.r.t. the number of threads. Extensive experiments are performed on matrix completion and robust principal component analysis. Significant speedup over the state of the art is observed.
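To see why thresholded singular values make a full SVD unnecessary, consider the sketch below. It is an illustration under assumptions, not the paper's algorithm: it uses the convex soft-thresholding rule as a stand-in for the nonconvex regularizers, recovers only the single leading singular component by power iteration, and all function names are hypothetical.

```python
import math


def leading_singular_triplet(a, iters=200):
    """Approximate the largest singular value and its vectors by power iteration on A^T A."""
    rows, cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(cols)] * cols  # fixed start; must not be orthogonal to the top vector
    for _ in range(iters):
        av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]   # A v
        w = [sum(a[i][j] * av[i] for i in range(rows)) for j in range(cols)]   # A^T (A v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # A v vanished, so there is nothing to recover from this start
            return 0.0, [0.0] * rows, v
        v = [x / norm for x in w]
    av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in av))
    if sigma == 0.0:
        return 0.0, [0.0] * rows, v
    u = [x / sigma for x in av]
    return sigma, u, v


def soft_threshold_rank_one(a, lam):
    """Shrink the leading singular value by lam and rebuild the rank-one component."""
    sigma, u, v = leading_singular_triplet(a)
    shrunk = max(sigma - lam, 0.0)
    return sigma, [[shrunk * ui * vj for vj in v] for ui in u]


# Singular values are 3.0 and 0.5; with lam = 1.0 the second one is thresholded to zero,
# so the leading component alone already gives the thresholded matrix.
matrix = [[3.0, 0.0],
          [0.0, 0.5]]
sigma, low_rank = soft_threshold_rank_one(matrix, lam=1.0)
```

In this toy case the thresholded result is rank one, so the work spent on the smaller singular value in a full SVD would have been wasted.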


Relational click prediction for sponsored search

Xiong, Wang, Ding et al. 2012


This paper is concerned with predicting clicks on ads in sponsored search. Accurate prediction of a user's click on an ad plays an important role in sponsored search, because it is widely used in both the ranking and the pricing of ads. Previous work on click prediction usually takes a single ad as input and ignores its relationship to the other ads shown on the same page. This independence assumption, however, might not hold in practice. In this paper, we first analyze this issue by looking at the click-through rates (CTR) of the same ad, in the same position and for the same query, but surrounded by different ads. We found that in most cases the CTR varies considerably, which suggests that the relationship between ads is an important factor in predicting click probability. Furthermore, our investigation shows that the more similar the surrounding ads are to an ad, the lower the CTR of the ad is. Based on this observation, we design a continuous conditional random fields (CRF) based model for click prediction, which considers both the features of an ad and its similarity to the surrounding ads. We show that the model can be effectively learned using maximum likelihood estimation, and that inference is efficient because it has a closed-form solution. Our experimental results on the click-through log from a commercial search engine show that the proposed model predicts clicks more accurately than previous independent models. To the best of our knowledge, this is the first work that predicts ad clicks by considering the relationship between ads.
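The analysis step described above, comparing the CTR of one ad across different sets of surrounding ads, can be sketched as a simple grouping. This is a hedged illustration: the log record layout, the ad and query names, and the function name are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


def ctr_by_context(impressions):
    """Compute CTR per (ad, position, query, surrounding ads) group.

    Each impression is a tuple: (ad, position, query, surrounding_ads, clicked).
    """
    shown = defaultdict(int)
    clicks = defaultdict(int)
    for ad, position, query, surrounding, clicked in impressions:
        # frozenset makes the set of surrounding ads hashable and order-independent
        key = (ad, position, query, frozenset(surrounding))
        shown[key] += 1
        clicks[key] += int(clicked)
    return {key: clicks[key] / shown[key] for key in shown}


# Made-up log: the same ad, position, and query, shown next to two different ads.
log = [
    ("a1", 1, "shoes", ["a2"], True),
    ("a1", 1, "shoes", ["a2"], True),
    ("a1", 1, "shoes", ["a2"], False),
    ("a1", 1, "shoes", ["a2"], False),
    ("a1", 1, "shoes", ["a3"], True),
    ("a1", 1, "shoes", ["a3"], False),
    ("a1", 1, "shoes", ["a3"], False),
    ("a1", 1, "shoes", ["a3"], False),
]
ctr = ctr_by_context(log)
ctr_next_to_a2 = ctr[("a1", 1, "shoes", frozenset(["a2"]))]
ctr_next_to_a3 = ctr[("a1", 1, "shoes", frozenset(["a3"]))]
```

A gap between the two values for an otherwise identical placement is the kind of signal that an independent per-ad model cannot capture.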


Document-level Event Extraction via Parallel Prediction Networks

Yang, Sui, Chen et al. 2021


Document-level event extraction (DEE) is indispensable when events are described throughout a document. We argue that sentence-level extractors are ill-suited to the DEE task, where event arguments are always scattered across sentences and multiple events may co-exist in a document. It is a challenging task because it requires a holistic understanding of the document and an aggregated ability to assemble arguments across multiple sentences. In this paper, we propose an end-to-end model that can extract structured events from a document in a parallel manner. Specifically, we first introduce a document-level encoder to obtain document-aware representations. Then, a multi-granularity non-autoregressive decoder is used to generate events in parallel. Finally, to train the entire model, a matching loss function is proposed, which can bootstrap a global optimization. The empirical results on the widely used DEE dataset show that our approach significantly outperforms current state-of-the-art methods in the challenging DEE task. Code will be available at https://github.com/HangYang-NLP/DE-PPN.
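When events are predicted in parallel, the predictions carry no inherent order, so a loss has to decide which prediction is compared with which gold event. One common way to do this is to take the cheapest one-to-one assignment. The sketch below assumes that reading of "matching loss"; the cost function, the event dictionaries, and the function names are hypothetical, and the brute-force search is only practical for a handful of events.

```python
from itertools import permutations


def role_mismatches(predicted, gold):
    """Count gold roles whose predicted value is missing or wrong (an assumed cost)."""
    return sum(1 for role, value in gold.items() if predicted.get(role) != value)


def matching_loss(predictions, gold_events, cost=role_mismatches):
    """Return the minimum total cost over all one-to-one assignments, plus the assignment."""
    if len(predictions) < len(gold_events):
        raise ValueError("need at least as many predictions as gold events")
    best_loss, best_assignment = None, None
    # Each candidate picks a distinct prediction index for every gold event, in order.
    for candidate in permutations(range(len(predictions)), len(gold_events)):
        total = sum(cost(predictions[i], gold) for i, gold in zip(candidate, gold_events))
        if best_loss is None or total < best_loss:
            best_loss, best_assignment = total, candidate
    return best_loss, best_assignment


# Made-up events: the predictions are correct except for one holder, but in a different order.
predicted_events = [
    {"type": "Freeze", "holder": "B"},
    {"type": "Pledge", "holder": "X"},
]
gold_events = [
    {"type": "Pledge", "holder": "A"},
    {"type": "Freeze", "holder": "B"},
]
loss, assignment = matching_loss(predicted_events, gold_events)
```

Here the best assignment pairs the second prediction with the first gold event, so the model is penalized only for the one wrong argument rather than for the ordering. For larger sets, the Hungarian algorithm finds the same optimum without enumerating permutations.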




