The qkI gene encodes an RNA-binding protein that was identified as a candidate for the classical neurological mutation qk^v. Although qkI is involved in glial cell differentiation in mice, qkI homologues in other species play important roles in various developmental processes. Here, we show a novel function of qkI in smooth muscle cell differentiation during embryonic blood vessel formation. qkI null embryos died between embryonic days 9.5 and 10.5. At embryonic day 9.5, qkI null embryos showed a lack of large vitelline vessels in the yolk sac, kinky neural tubes, pericardial effusion, open neural tubes, and incomplete embryonic turning. Using X-gal and immunohistochemical staining, we show for the first time that qkI is expressed in endothelial cells and smooth muscle cells. Analyses of qkI null embryos in vivo and in vitro revealed that the vitelline artery was too thin to connect properly to the yolk sac, thereby preventing remodeling of the yolk sac vasculature, and that the vitelline vessel was deficient in smooth muscle cells. Adding QKI- and platelet-endothelial cell adhesion molecule-1-positive cells to an in vitro para-aortic splanchnopleural culture of qkI null embryos rescued the vascular remodeling deficit. These data suggest that the QKI protein plays a critical regulatory role in smooth muscle cell development, and that smooth muscle cells in turn play an important role in inducing vascular remodeling.
In the deep learning (DL) era, parsing models have been greatly simplified with little loss in performance, thanks to the remarkable capability of multi-layer BiLSTMs in context representation. As the most popular graph-based dependency parser, owing to its high efficiency and performance, the biaffine parser directly scores single dependencies under the arc-factorization assumption and adopts a very simple local token-wise cross-entropy training loss. This paper presents, for the first time, a second-order TreeCRF extension to the biaffine parser. For a long time, the complexity and inefficiency of the inside-outside algorithm hindered the popularity of TreeCRF. To address this issue, we propose an effective way to batchify the inside and Viterbi algorithms for direct large-matrix operations on GPUs, and to avoid the complex outside algorithm via efficient back-propagation. Experiments and analysis on 27 datasets from 13 languages clearly show that techniques developed before the DL era, such as structural learning (a global TreeCRF loss) and high-order modeling, are still useful and can further boost parsing performance over the state-of-the-art biaffine parser, especially for partially annotated training data. We release our code at https:
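To illustrate the inside computation that gets batchified, here is a minimal, unbatched sketch of Eisner's inside algorithm in log space over projective dependency trees. The arc scores, sentence length, and function names are hypothetical illustrations, not the released code; in the batchified version, the inner loops over split points become single logsumexp operations over large GPU tensors, and arc marginals come from back-propagating through log Z rather than from an explicit outside pass.

```python
import math

def logsumexp(xs):
    m = max(xs)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

def inside_log_partition(score):
    """Eisner inside algorithm in log space.

    score[h][m] is the log-potential of arc h -> m; token 0 is the root.
    Returns log Z, the log partition function over projective trees.
    """
    n = len(score) - 1            # number of real words
    NEG = float("-inf")
    # I = incomplete spans, C = complete spans; direction 0 = left (<-), 1 = right (->)
    I = [[[NEG, NEG] for _ in range(n + 1)] for _ in range(n + 1)]
    C = [[[NEG, NEG] for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        C[i][i][0] = C[i][i][1] = 0.0
    for w in range(1, n + 1):     # span width
        for i in range(0, n + 1 - w):
            j = i + w
            # all ways to build an incomplete span from two complete halves
            s = logsumexp([C[i][k][1] + C[k + 1][j][0] for k in range(i, j)])
            if i > 0:             # the root can never be a dependent
                I[i][j][0] = s + score[j][i]
            I[i][j][1] = s + score[i][j]
            C[i][j][0] = logsumexp([C[i][k][0] + I[k][j][0] for k in range(i, j)])
            C[i][j][1] = logsumexp([I[i][k][1] + C[k][j][1] for k in range(i + 1, j + 1)])
    return C[0][n][1]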
In order to effectively utilize multiple datasets with heterogeneous annotations, this paper proposes a coupled sequence labeling model that can directly learn and infer two heterogeneous annotations simultaneously; to facilitate discussion, we use Chinese part-of-speech (POS) tagging as our case study. The key idea is to bundle two sets of POS tags together (e.g. "[NN, n]"), and build a conditional random field (CRF) based tagging model in the enlarged space of bundled tags with the help of ambiguous labelings. To train our model on two non-overlapping datasets, each of which has only one-side tags, we transform a one-side tag into a set of bundled tags by considering all possible mappings on the missing side, and derive an objective function based on ambiguous labelings. The key advantage of our coupled model is the flexibility of 1) incorporating joint features on the bundled tags to implicitly learn the loose mapping between heterogeneous annotations, and 2) exploiting separate features on one-side tags to overcome the data sparseness problem of using only bundled tags. Experiments on benchmark datasets show that our coupled model significantly outperforms the state-of-the-art baselines on both one-side POS tagging and annotation conversion tasks. The code and newly annotated data are released for non-commercial usage.
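The tag-bundling transformation can be sketched as follows; the miniature two-tag tagsets below are hypothetical stand-ins for the much larger real annotation schemes (e.g. CTB and PD tags):

```python
from itertools import product

# Hypothetical miniature tagsets standing in for two annotation schemes.
CTB_TAGS = ["NN", "VV"]
PD_TAGS = ["n", "v"]

# The enlarged space of bundled tags, e.g. ("NN", "n").
BUNDLED = list(product(CTB_TAGS, PD_TAGS))

def expand_one_side(tag, side):
    """Turn a one-side tag into its ambiguous labeling: the set of
    bundled tags with every possible tag on the missing side."""
    if side == "CTB":
        return [(tag, p) for p in PD_TAGS]
    return [(c, tag) for c in CTB_TAGS]
```

During training, the CRF objective then sums over all bundled tag sequences consistent with these ambiguous labelings, so each dataset supervises only its own side while joint features tie the two sides together.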
Protamines are expressed in the spermatid nucleus and allow denser packaging of DNA compared with histones. Disruption of the coding sequence of one allele of either protamine 1 (Prm1) or Prm2 results in failure to produce offspring, although sperm with disrupted Prm1 or Prm2 alleles are produced. Here, we produced Prm1-deficient female chimeric mice carrying Prm1-deficient oocytes. These mice successfully produced Prm1+/− male mice. Healthy Prm1+/− offspring were then produced by transferring blastocysts obtained via in vitro fertilization using zona-free oocytes and sperm from Prm1+/− mice. This result suggests that sperm lacking Prm1 can generate offspring despite being abnormally shaped and having destabilised DNA, decondensed chromatin and a reduction in mitochondrial membrane potential. Nevertheless, these mice showed little derangement of expression profiles.
Spelling check is an important preprocessing task when dealing with user-generated texts such as tweets and product comments. Compared with western languages such as English, Chinese spelling check is more complex because there is no word delimiter in Chinese written text, and misspelled characters can only be determined at the word level. Our system works as follows. First, we use character-level n-gram language models to detect potentially misspelled characters whose probabilities fall below a predefined threshold. Second, for each potentially incorrect character, we generate a candidate set based on pronunciation and shape similarities. Third, we filter out candidate corrections that cannot form a legal word with their neighbors according to a word dictionary. Finally, we select the candidate with the highest language model probability. If that probability exceeds a predefined threshold, we replace the original character; otherwise, we consider the original character correct and take no action. Our preliminary experiments show that this simple method achieves relatively high precision but low recall.
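The four-step pipeline can be sketched as follows. The character-bigram language model, confusion sets, word dictionary, and thresholds below are tiny hypothetical stand-ins for the real resources the system assumes:

```python
# Tiny hypothetical resources: a character-bigram LM (log probabilities),
# pronunciation/shape confusion sets, and a word dictionary.
BIGRAM_LOGP = {("今", "天"): -0.7, ("天", "气"): -0.5}
CONFUSION = {"汽": ["气"]}
WORDS = {"今天", "天气"}
UNSEEN = -20.0      # log-probability assigned to unseen bigrams
THRESHOLD = -5.0    # detection / replacement threshold

def detect(chars):
    """Step 1: flag characters whose bigram log-probability is too low."""
    return [i for i in range(1, len(chars))
            if BIGRAM_LOGP.get((chars[i - 1], chars[i]), UNSEEN) < THRESHOLD]

def correct(sentence):
    chars = list(sentence)
    for i in detect(chars):
        best, best_lp = chars[i], BIGRAM_LOGP.get((chars[i - 1], chars[i]), UNSEEN)
        # Step 2: candidates similar in pronunciation or shape.
        for cand in CONFUSION.get(chars[i], []):
            # Step 3: dictionary filter -- the candidate must form a legal
            # word with at least one neighboring character.
            left_ok = chars[i - 1] + cand in WORDS
            right_ok = i + 1 < len(chars) and cand + chars[i + 1] in WORDS
            if not (left_ok or right_ok):
                continue
            # Step 4: keep the highest-probability candidate, replacing
            # only if it clears the threshold.
            lp = BIGRAM_LOGP.get((chars[i - 1], cand), UNSEEN)
            if lp > best_lp and lp > THRESHOLD:
                best, best_lp = cand, lp
        chars[i] = best
    return "".join(chars)
```

On this toy data, "今天汽" is corrected to "今天气" because "汽" has a low bigram probability, its confusion-set candidate "气" forms the dictionary word "天气" with its left neighbor, and the corrected bigram clears the threshold.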
Dependency parsing has gained increasing interest in natural language processing in recent years due to its simplicity and general applicability to diverse languages. Previous work demonstrates that part-of-speech (POS) tags are an indispensable feature in dependency parsing, since purely lexical features suffer from a serious data sparseness problem. However, because Chinese has little morphological variation, Chinese POS tagging has proven to be much more challenging than tagging for morphologically richer languages such as English (94% vs. 97% tagging accuracy). This leads to severe error propagation in Chinese dependency parsing. Our experiments show that parsing accuracy drops by about 6% when manual POS tags of the input sentence are replaced with automatic ones generated by a state-of-the-art statistical POS tagger. To address this issue, this paper proposes jointly optimizing POS tagging and dependency parsing in a single model. We propose several dynamic programming based decoding algorithms for our joint models that can incorporate rich POS tagging and syntactic features. We then present an effective pruning strategy that reduces the search space of candidate POS tags, leading to a significant improvement in parsing speed. Experimental results on two Chinese datasets, Penn Chinese Treebank 5.1 and Penn Chinese Treebank 7, demonstrate that our joint models significantly improve both the state-of-the-art tagging and parsing accuracies. Detailed analysis shows that the joint method helps resolve syntax-sensitive POS ambiguities; in return, the POS tags become more reliable and helpful for parsing, since syntactic features are used in POS tagging. This is the fundamental reason for the performance improvement.

Index Terms: Dependency parsing, dynamic programming, joint models, part-of-speech tagging.
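The POS-tag pruning idea can be sketched as follows; the marginal distributions, ratio threshold, and function name are hypothetical illustrations rather than the paper's exact implementation:

```python
def prune_tags(tag_marginals, ratio=0.01, max_tags=3):
    """For each token, keep only the POS tags whose tagger marginal
    probability is within a given ratio of the best tag's probability
    (capped at max_tags), shrinking the joint decoder's search space."""
    kept = []
    for dist in tag_marginals:          # one {tag: probability} dict per token
        best = max(dist.values())
        tags = sorted((t for t, p in dist.items() if p >= best * ratio),
                      key=lambda t: -dist[t])[:max_tags]
        kept.append(tags)
    return kept
```

The joint decoder then enumerates only the surviving tags per token, so its dynamic program runs over a much smaller candidate space at little cost in oracle accuracy.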