Protein-coding gene annotation. To search for homologous genes, protein sequences from all fern and lycophyte transcriptomes in the OneKP project 1 were retrieved and aligned to the A. capillus-veneris genome using GeneWise 2. For transcriptome-based prediction, nineteen transcriptomes covering the entire life cycle of A. capillus-veneris were generated in this study (Supplementary Table 8). RNA was extracted using the Qiagen RNeasy protocol and sequenced on an Illumina HiSeq 4000 with a 300 bp insert size. HISAT2 3 and StringTie 4 were used for transcript assembly 5, and PASA (http://pasapipeline.github.io) was used to align the spliced transcripts and annotate candidate genes. Ab initio prediction was performed with AUGUSTUS 6, GlimmerHMM 7, and SNAP 8. Finally, EVidenceModeler (version 1.1.0) 9 was used to integrate the gene models derived from the different datasets into a nonredundant set.
To validate the assembly quality, RNA-seq reads from nineteen tissues (Supplementary Table 8) and publicly available EST sequences from the NCBI database (downloaded from http://togodb.dbcls.jp/library) were mapped to the A. capillus-veneris genome using HISAT2 3 and BLAT 10, respectively, both with default parameters. The BLAT results were filtered with an identity and coverage cutoff of 0.9.
Identification of noncoding RNAs. We used tRNAscan-SE (version 2.0rc2) 11 with default parameters to search for tRNAs in the A. capillus-veneris genome; a total of 1,624 tRNAs were found. Additional noncoding RNAs (ncRNAs), including miRNAs, snRNAs, and tRNAs, were annotated with INFERNAL (version 1.1.2) 13 against the Rfam 14.0 database 12, which includes 3,445 noncoding RNA families. We predicted rRNAs (5S, 5.8S, 28S, and 18S) with the HMM-based rRNA predictor Barrnap (version 0.9, https://github.com/tseemann/barrnap#barrnap) using default parameters. In total, we identified 145 5S, 75 5.8S, 155 28S, and 165 18S sequences and their locations within the A. capillus-veneris genome assembly.
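The filtering of BLAT results at an identity and coverage cutoff of 0.9 can be illustrated with a minimal Python sketch. The text does not say how identity and coverage were computed, so the definitions here are assumptions: identity is taken as (matches + repMatches) / aligned bases, coverage as aligned bases / query length, and the cutoff is treated as inclusive. The input is assumed to be BLAT's standard tab-separated PSL output, and the names `parse_psl` and `filter_hits` are hypothetical helpers, not part of any published pipeline.

```python
from typing import Iterable, Iterator, List, NamedTuple


class PslHit(NamedTuple):
    query: str
    target: str
    identity: float  # assumed: (matches + repMatches) / aligned bases
    coverage: float  # assumed: aligned bases / query length


def parse_psl(lines: Iterable[str]) -> Iterator[PslHit]:
    """Yield one PslHit per PSL alignment row, skipping header lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Alignment rows have 21 columns and begin with an integer match count.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches, mismatches, rep_matches = (int(x) for x in fields[:3])
        q_size = int(fields[10])  # column 11 of PSL: query sequence length
        aligned = matches + mismatches + rep_matches
        if aligned == 0 or q_size == 0:
            continue
        yield PslHit(
            query=fields[9],    # column 10: query name
            target=fields[13],  # column 14: target name
            identity=(matches + rep_matches) / aligned,
            coverage=aligned / q_size,
        )


def filter_hits(
    hits: Iterable[PslHit],
    min_identity: float = 0.9,
    min_coverage: float = 0.9,
) -> List[PslHit]:
    """Keep hits meeting both cutoffs (inclusive, by assumption)."""
    return [
        h for h in hits
        if h.identity >= min_identity and h.coverage >= min_coverage
    ]
```

Given an open PSL file handle `fh`, `filter_hits(parse_psl(fh))` would return the alignments that pass both thresholds. Other definitions, such as coverage based on the query span (qEnd minus qStart), would give slightly different results for gapped alignments.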
Key message: We re-annotated the repeats of 459 plant genomes and released a new database, PlantRep (http://www.plantrep.cn/). PlantRep sheds light on repeat evolution and provides fundamental data for in-depth exploration of genomes.