We present a novel iterative, edit-based approach to unsupervised sentence simplification. Our model is guided by a scoring function involving fluency, simplicity, and meaning preservation, and it iteratively performs word- and phrase-level edits on the complex sentence. Compared with previous approaches, our model does not require a parallel training set and is more controllable and interpretable. Experiments on the Newsela and WikiLarge datasets show that our approach is nearly as effective as state-of-the-art supervised approaches.
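As a rough illustration of the idea in the abstract above, the following minimal sketch runs a greedy search over single-word edits, keeping an edit only when a combined score improves. Everything in it is an assumption made for illustration and is not taken from the paper: the toy dictionaries `SIMPLER` and `FILLER`, the placeholder `fluency` function (where a language model would normally sit), the character-count proxy for simplicity, and the cubic weight on meaning preservation. Phrase-level edits are omitted for brevity.

```python
from typing import Dict, List, Set

Sentence = List[str]

# Hypothetical toy resources, invented for this illustration only.
SIMPLER: Dict[str, str] = {"utilize": "use", "commence": "begin"}
FILLER: Set[str] = {"very", "really", "basically"}


def canonical(word: str) -> str:
    """Map a word to its simpler synonym so substitutions keep the meaning."""
    return SIMPLER.get(word, word)


def fluency(sent: Sentence) -> float:
    """Placeholder: a real system would query a language model here."""
    return 1.0 if sent else 0.0


def simplicity(sent: Sentence) -> float:
    """Toy proxy: fewer characters overall counts as simpler."""
    return 1.0 / (1.0 + sum(len(w) for w in sent))


def meaning(sent: Sentence, source: Sentence) -> float:
    """Fraction of the source's content words still present in the candidate."""
    content = {canonical(w) for w in source if w not in FILLER}
    kept = {canonical(w) for w in sent}
    return len(content & kept) / len(content) if content else 1.0


def score(sent: Sentence, source: Sentence) -> float:
    # The exponent on meaning is an assumed weight that discourages
    # deleting content words just to shorten the sentence.
    return fluency(sent) * simplicity(sent) * meaning(sent, source) ** 3


def candidates(sent: Sentence) -> List[Sentence]:
    """All single-word deletions and dictionary substitutions."""
    out: List[Sentence] = []
    for i, w in enumerate(sent):
        if len(sent) > 1:
            out.append(sent[:i] + sent[i + 1:])  # deletion edit
        if w in SIMPLER:
            out.append(sent[:i] + [SIMPLER[w]] + sent[i + 1:])  # substitution edit
    return out


def simplify(source: Sentence, max_steps: int = 20) -> Sentence:
    """Greedily apply the best-scoring edit until no edit improves the score."""
    current = list(source)
    for _ in range(max_steps):
        options = candidates(current)
        if not options:
            break
        best = max(options, key=lambda c: score(c, source))
        if score(best, source) <= score(current, source):
            break
        current = best
    return current


result = simplify("we really utilize very simple methods".split())
print(" ".join(result))
```

Because every accepted edit is a single, inspectable step, a search of this kind can be traced edit by edit, which is one way to read the abstract's claim of controllability and interpretability.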
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent Anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models. This paper serves as the description of the data for which we are organizing a shared task at our ACL 2021 Workshop and in which we invite the entire NLG community to participate.
WRKY, a plant-specific transcription factor family, has important roles in pathogen defense, abiotic cues, and phytohormone signaling, yet little is known about its roles and molecular mechanisms in the response to rust diseases in wheat. We identified 100 TaWRKY sequences using the wheat Expressed Sequence Tag database, of which 22 WRKY sequences were novel. The identified proteins were characterized based on their zinc finger motifs, and phylogenetic analysis clustered them into six clades consisting of class IIc and class III WRKY proteins. Functional annotation revealed major functions in metabolic and cellular processes in control plants, and in response to stimuli, signaling, and defense in pathogen-inoculated plants; their major molecular function was DNA binding. Tag-based expression analysis of the identified genes revealed differential expression between mock-inoculated and Puccinia triticina-inoculated wheat near-isogenic lines. Gene expression analysis was also performed with six rust-related microarray experiments from the Gene Expression Omnibus database. TaWRKY10, 15, 17, and 56 were common to both the tag-based and microarray-based differential expression analyses and may represent rust-specific WRKY genes. The results provide insight into the functional characterization of WRKY transcription factors responsive to leaf rust pathogenesis, which can be used as candidate genes in molecular breeding programs to improve biotic stress tolerance in wheat.
MicroRNAs (miRNAs) are endogenous small noncoding RNAs that play critical roles in gene regulation. Few wheat (Triticum aestivum L.) miRNA sequences are available in the miRBase repertoire, and knowledge of their biological functions related to biotic stress is limited. We identified 52 miRNAs, belonging to 19 families, from next-generation transcriptome sequence data based on homology search. One novel wheat-specific miRNA was identified but could not be assigned to any known miRNA family. Twenty-two miRNAs were differentially expressed between susceptible and resistant wheat near-isogenic lines (NILs) inoculated with the leaf rust pathogen Puccinia triticina, compared with mock-inoculated controls. Most miRNAs were more strongly upregulated in the susceptible NIL than in the resistant NIL. We identified 1306 potential target genes for these 52 miRNAs with vital roles in response to stimuli, signaling, and diverse metabolic and cellular processes. Gene ontology analysis showed 66, 20, and 35 target genes to be categorized into biological process, molecular function, and cellular component, respectively. A miRNA-mediated regulatory network revealed relationships among the components of the targetome. The present study provides insight into potential miRNAs with probable roles in leaf rust pathogenesis and their target genes in wheat, which establishes a foundation for future studies.
Studying the expression of genes for traits associated with hypoxia tolerance during germination demands a robust choice of reference genes for transcript data normalization and gene validation through real-time quantitative polymerase chain reaction (RT-qPCR). However, the reliability and stability of reference genes across different rice germplasms under hypoxic conditions have not yet been assessed. The stability of reference genes such as eukaryotic elongation factor 1 α (eEF1α), ubiquitin 10 (UBQ10), glyceraldehyde 3-phosphate dehydrogenase (GAPDH), 18S ribosomal RNA (18SrRNA), 25S ribosomal RNA (25SrRNA), β-tubulin (β-TUB), actin11 (ACT11), ubiquitin C (UBC), eukaryotic initiation factor 4 α (eIF4α), and ubiquitin5 (UBQ5) was assessed with statistical algorithms (geNorm, NormFinder, the comparative ΔCt method, BestKeeper, and RefFinder) in three rice germplasms (KHO, RKB, and IR-64) with varied levels of tolerance to hypoxic conditions during germination. Among all genes used, OsGAPDH was found to be the most suitable reference gene under hypoxic conditions. The reliability and stability of this highest-ranking reference gene (OsGAPDH) were further validated through RT-qPCR with the hypoxia-induced target gene OsTTP7. The identified stable housekeeping gene could be used as an internal control for gene expression analysis in rice under hypoxia.
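To make one of the ranking algorithms named above more concrete, the following minimal sketch illustrates the comparative ΔCt idea: for every pair of candidate genes, the Ct difference is computed in each sample, and a gene whose pairwise differences vary little across samples (low mean standard deviation) is ranked as more stable. The gene labels and Ct values are made up for illustration and are not data from the study; the function name `delta_ct_stability` is likewise hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def delta_ct_stability(ct: Dict[str, List[float]]) -> Dict[str, float]:
    """Mean SD of pairwise delta-Ct values per gene; lower means more stable."""
    result: Dict[str, float] = {}
    for gene, values in ct.items():
        sds = []
        for other, other_values in ct.items():
            if other == gene:
                continue
            # Ct difference between the two genes in each sample.
            deltas = [a - b for a, b in zip(values, other_values)]
            sds.append(stdev(deltas))
        result[gene] = mean(sds)
    return result


# Made-up Ct values for four samples; not data from the study.
ct_values = {
    "GENE_A": [20.0, 20.5, 21.0, 20.5],
    "GENE_B": [18.0, 18.6, 19.0, 18.4],
    "GENE_C": [22.0, 25.0, 21.0, 26.0],
}

stability = delta_ct_stability(ct_values)
ranking = sorted(stability, key=stability.get)  # most stable gene first
print(ranking)
```

In this toy input, GENE_A and GENE_B move together across samples while GENE_C fluctuates, so GENE_C ends up ranked as the least stable candidate.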