The qkI gene encodes an RNA-binding protein that was identified as a candidate for the classical neurological mutation qk^v. Although qkI is involved in glial cell differentiation in mice, qkI homologues in other species play important roles in various developmental processes. Here, we show a novel function of qkI in smooth muscle cell differentiation during embryonic blood vessel formation. qkI null embryos died between embryonic days 9.5 and 10.5. At embryonic day 9.5, qkI null embryos showed a lack of large vitelline vessels in the yolk sac, kinky neural tubes, pericardial effusion, open neural tubes, and incomplete embryonic turning. Using X-gal and immunohistochemical staining, we show for the first time that qkI is expressed in endothelial cells and smooth muscle cells. Analyses of qkI null embryos in vivo and in vitro revealed that the vitelline artery was too thin to connect properly to the yolk sac, thereby preventing remodeling of the yolk sac vasculature, and that the vitelline vessel was deficient in smooth muscle cells. Adding QKI- and platelet-endothelial cell adhesion molecule-1-positive cells to an in vitro para-aortic splanchnopleural culture of qkI null embryos rescued the vascular remodeling deficit. These data suggest that the QKI protein plays a critical regulatory role in smooth muscle cell development, and that smooth muscle cells in turn play an important role in inducing vascular remodeling.
In the deep learning (DL) era, parsing models have been greatly simplified with little loss in performance, thanks to the remarkable capability of multi-layer BiLSTMs in context representation. As the most popular graph-based dependency parser, owing to its high efficiency and performance, the biaffine parser directly scores single dependencies under the arc-factorization assumption and adopts a very simple local token-wise cross-entropy training loss. This paper presents, for the first time, a second-order TreeCRF extension to the biaffine parser. For a long time, the complexity and inefficiency of the inside-outside algorithm hindered the popularity of TreeCRF. To address this issue, we propose an effective way to batchify the inside and Viterbi algorithms for direct large-matrix operations on GPUs, and to avoid the complex outside algorithm via efficient back-propagation. Experiments and analysis on 27 datasets from 13 languages clearly show that techniques developed before the DL era, such as structural learning (a global TreeCRF loss) and high-order modeling, are still useful and can further boost parsing performance over the state-of-the-art biaffine parser, especially for partially annotated training data. We release our code at https:
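To illustrate the inside computation that gets batchified, here is a minimal, unbatched sketch of Eisner's inside algorithm in log space over projective dependency trees. The arc scores, sentence length, and function names are hypothetical illustrations, not the released code; in the batchified version, the inner loops over split points become single logsumexp operations over large GPU tensors, and arc marginals come from back-propagating through log Z rather than from an explicit outside pass.

```python
import math

def logsumexp(xs):
    m = max(xs)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

def inside_log_partition(score):
    """Eisner inside algorithm in log space.

    score[h][m] is the log-potential of arc h -> m; token 0 is the root.
    Returns log Z, the log partition function over projective trees.
    """
    n = len(score) - 1            # number of real words
    NEG = float("-inf")
    # I = incomplete spans, C = complete spans; direction 0 = left (<-), 1 = right (->)
    I = [[[NEG, NEG] for _ in range(n + 1)] for _ in range(n + 1)]
    C = [[[NEG, NEG] for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        C[i][i][0] = C[i][i][1] = 0.0
    for w in range(1, n + 1):     # span width
        for i in range(0, n + 1 - w):
            j = i + w
            # all ways to build an incomplete span from two complete halves
            s = logsumexp([C[i][k][1] + C[k + 1][j][0] for k in range(i, j)])
            if i > 0:             # the root can never be a dependent
                I[i][j][0] = s + score[j][i]
            I[i][j][1] = s + score[i][j]
            C[i][j][0] = logsumexp([C[i][k][0] + I[k][j][0] for k in range(i, j)])
            C[i][j][1] = logsumexp([I[i][k][1] + C[k][j][1] for k in range(i + 1, j + 1)])
    return C[0][n][1]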
In order to effectively utilize multiple datasets with heterogeneous annotations, this paper proposes a coupled sequence labeling model that can directly learn and infer two heterogeneous annotations simultaneously; to facilitate discussion, we use Chinese part-of-speech (POS) tagging as our case study. The key idea is to bundle two sets of POS tags together (e.g. "[NN, n]"), and build a conditional random field (CRF) based tagging model in the enlarged space of bundled tags with the help of ambiguous labelings. To train our model on two non-overlapping datasets, each of which has only one-side tags, we transform a one-side tag into a set of bundled tags by considering all possible mappings on the missing side, and derive an objective function based on ambiguous labelings. The key advantage of our coupled model is the flexibility of 1) incorporating joint features on the bundled tags to implicitly learn the loose mapping between heterogeneous annotations, and 2) exploiting separate features on one-side tags to overcome the data sparseness problem of using only bundled tags. Experiments on benchmark datasets show that our coupled model significantly outperforms the state-of-the-art baselines on both one-side POS tagging and annotation conversion tasks. The code and newly annotated data are released for non-commercial usage.
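The tag-bundling transformation can be sketched as follows; the miniature two-tag tagsets below are hypothetical stand-ins for the much larger real annotation schemes (e.g. CTB and PD tags):

```python
from itertools import product

# Hypothetical miniature tagsets standing in for two annotation schemes.
CTB_TAGS = ["NN", "VV"]
PD_TAGS = ["n", "v"]

# The enlarged space of bundled tags, e.g. ("NN", "n").
BUNDLED = list(product(CTB_TAGS, PD_TAGS))

def expand_one_side(tag, side):
    """Turn a one-side tag into its ambiguous labeling: the set of
    bundled tags with every possible tag on the missing side."""
    if side == "CTB":
        return [(tag, p) for p in PD_TAGS]
    return [(c, tag) for c in CTB_TAGS]
```

During training, the CRF objective then sums over all bundled tag sequences consistent with these ambiguous labelings, so each dataset supervises only its own side while joint features tie the two sides together.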
Protamines are expressed in the spermatid nucleus and allow denser packaging of DNA compared with histones. Disruption of the coding sequence of one allele of either protamine 1 (Prm1) or Prm2 results in failure to produce offspring, although sperm with disrupted Prm1 or Prm2 alleles are produced. Here, we produced Prm1-deficient female chimeric mice carrying Prm1-deficient oocytes. These mice successfully produced Prm1+/− male mice. Healthy Prm1+/− offspring were then produced by transferring blastocysts obtained via in vitro fertilization using zona-free oocytes and sperm from Prm1+/− mice. This result suggests that sperm lacking Prm1 can generate offspring despite being abnormally shaped and having destabilised DNA, decondensed chromatin and a reduction in mitochondrial membrane potential. Nevertheless, these mice showed little derangement of expression profiles.
Spelling check is an important preprocessing task when dealing with user-generated texts such as tweets and product comments. Compared with western languages such as English, Chinese spelling check is more complex because there is no word delimiter in Chinese written text, and misspelled characters can only be determined at the word level. Our system works as follows. First, we use character-level n-gram language models to detect potentially misspelled characters whose probabilities fall below a predefined threshold. Second, for each potentially incorrect character, we generate a candidate set based on pronunciation and shape similarities. Third, we filter out candidate corrections that cannot form a legal word with their neighbors according to a word dictionary. Finally, we select the candidate with the highest language model probability. If that probability exceeds a predefined threshold, we replace the original character; otherwise, we consider the original character correct and take no action. Our preliminary experiments show that this simple method achieves relatively high precision but low recall.
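The four-step pipeline can be sketched as follows. The character-bigram language model, confusion sets, word dictionary, and thresholds below are tiny hypothetical stand-ins for the real resources the system assumes:

```python
# Tiny hypothetical resources: a character-bigram LM (log probabilities),
# pronunciation/shape confusion sets, and a word dictionary.
BIGRAM_LOGP = {("今", "天"): -0.7, ("天", "气"): -0.5}
CONFUSION = {"汽": ["气"]}
WORDS = {"今天", "天气"}
UNSEEN = -20.0      # log-probability assigned to unseen bigrams
THRESHOLD = -5.0    # detection / replacement threshold

def detect(chars):
    """Step 1: flag characters whose bigram log-probability is too low."""
    return [i for i in range(1, len(chars))
            if BIGRAM_LOGP.get((chars[i - 1], chars[i]), UNSEEN) < THRESHOLD]

def correct(sentence):
    chars = list(sentence)
    for i in detect(chars):
        best, best_lp = chars[i], BIGRAM_LOGP.get((chars[i - 1], chars[i]), UNSEEN)
        # Step 2: candidates similar in pronunciation or shape.
        for cand in CONFUSION.get(chars[i], []):
            # Step 3: dictionary filter -- the candidate must form a legal
            # word with at least one neighboring character.
            left_ok = chars[i - 1] + cand in WORDS
            right_ok = i + 1 < len(chars) and cand + chars[i + 1] in WORDS
            if not (left_ok or right_ok):
                continue
            # Step 4: keep the highest-probability candidate, replacing
            # only if it clears the threshold.
            lp = BIGRAM_LOGP.get((chars[i - 1], cand), UNSEEN)
            if lp > best_lp and lp > THRESHOLD:
                best, best_lp = cand, lp
        chars[i] = best
    return "".join(chars)
```

On this toy data, "今天汽" is corrected to "今天气" because "汽" has a low bigram probability, its confusion-set candidate "气" forms the dictionary word "天气" with its left neighbor, and the corrected bigram clears the threshold.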
Dependency parsing has gained increasing interest in natural language processing in recent years due to its simplicity and general applicability to diverse languages. Previous work demonstrates that part-of-speech (POS) tags are an indispensable feature in dependency parsing, since purely lexical features suffer from a serious data sparseness problem. However, because Chinese has little morphological variation, Chinese POS tagging has proven to be much more challenging than tagging for morphologically richer languages such as English (94% vs. 97% tagging accuracy). This leads to severe error propagation in Chinese dependency parsing. Our experiments show that parsing accuracy drops by about 6% when manual POS tags of the input sentence are replaced with automatic ones generated by a state-of-the-art statistical POS tagger. To address this issue, this paper proposes jointly optimizing POS tagging and dependency parsing in a single model. We propose several dynamic programming based decoding algorithms for our joint models that can incorporate rich POS tagging and syntactic features. We then present an effective pruning strategy that reduces the search space of candidate POS tags, leading to a significant improvement in parsing speed. Experimental results on two Chinese datasets, Penn Chinese Treebank 5.1 and Penn Chinese Treebank 7, demonstrate that our joint models significantly improve both the state-of-the-art tagging and parsing accuracies. Detailed analysis shows that the joint method helps resolve syntax-sensitive POS ambiguities; in return, the POS tags become more reliable and helpful for parsing, since syntactic features are used in POS tagging. This is the fundamental reason for the performance improvement.

Index Terms: Dependency parsing, dynamic programming, joint models, part-of-speech tagging.
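The POS-tag pruning idea can be sketched as follows; the marginal distributions, ratio threshold, and function name are hypothetical illustrations rather than the paper's exact implementation:

```python
def prune_tags(tag_marginals, ratio=0.01, max_tags=3):
    """For each token, keep only the POS tags whose tagger marginal
    probability is within a given ratio of the best tag's probability
    (capped at max_tags), shrinking the joint decoder's search space."""
    kept = []
    for dist in tag_marginals:          # one {tag: probability} dict per token
        best = max(dist.values())
        tags = sorted((t for t, p in dist.items() if p >= best * ratio),
                      key=lambda t: -dist[t])[:max_tags]
        kept.append(tags)
    return kept
```

The joint decoder then enumerates only the surviving tags per token, so its dynamic program runs over a much smaller candidate space at little cost in oracle accuracy.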