2019
DOI: 10.7717/peerj.6374

Genes of the pig, Sus scrofa, reconstructed with EvidentialGene

Abstract: The pig is a well-studied model animal of biomedical and agricultural importance. Genes of this species, Sus scrofa, are known from experiments and predictions, and collected at the NCBI reference sequence database section. Gene reconstruction from transcribed gene evidence of RNA-seq now can accurately and completely reproduce the biological gene sets of animals and plants. Such a gene set for the pig is reported here, including human orthologs missing from current NCBI and Ensembl reference pig gene sets, ad…

Cited by 46 publications (35 citation statements)
References: 27 publications
“…All of the de novo assembled transcripts of Pm1119 mapped onto the Pm1119 genome, thereby confirming the completeness of the gene space in the reference. The transcripts from UCR-PA7, Pm1118 and Pm449 that did not map onto the Pm1119 genome were merged using EvidentialGene ( Gilbert, 2013 ), to generate a non-redundant set of protein-CDS. We identified a total of 455 CDS encoding complete proteins that were not present in the Pm1119 reference: 11 of these were shared by two isolates, whereas 195, 98, and 151 were found only in UCR-PA7, Pm1118, and Pm449, respectively ( Supplementary Data S3 and Supplementary Data S1 : Figure S9 ).…”
Section: Results (mentioning)
confidence: 99%
“…Transcripts derived from mitochondrial genes, with internal stop codon(s), without a starting methionine or a stop codon were removed. Transcript redundancies were resolved using the tr2aacds program of EvidentialGene ( Gilbert, 2013 ), which selects from clusters of highly similar contigs the “best” representative transcript based on CDS and protein length. The set of non-redundant transcripts absent in Pm1119 was added to the reference transcriptome to compose the Pm.…”
Section: Methods (mentioning)
confidence: 99%
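The selection rule described in the statement above (one "best" representative per cluster of highly similar contigs, chosen by CDS and protein length) can be illustrated with a minimal sketch. This is not the tr2aacds implementation: the `Contig` record, its field names, and the tie-breaking order (protein length first, then CDS length) are assumptions made for illustration, and the clustering itself is taken as already done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contig:
    """Hypothetical record for one assembled transcript (not an EvidentialGene type)."""
    contig_id: str
    cluster_id: str   # assumed to come from a prior similarity-clustering step
    cds_len: int      # coding sequence length in nucleotides
    protein_len: int  # translated protein length in amino acids


def pick_representatives(contigs):
    """Return {cluster_id: Contig}, keeping the longest protein per cluster.

    Ties on protein length are broken by CDS length; remaining ties keep
    the contig seen first. This ordering is an assumption of the sketch.
    """
    best = {}
    for contig in contigs:
        current = best.get(contig.cluster_id)
        key = (contig.protein_len, contig.cds_len)
        if current is None or key > (current.protein_len, current.cds_len):
            best[contig.cluster_id] = contig
    return best


# Made-up example data: two clusters, three contigs.
example = [
    Contig("t1", "c1", cds_len=900, protein_len=299),
    Contig("t2", "c1", cds_len=1200, protein_len=399),
    Contig("t3", "c2", cds_len=600, protein_len=199),
]
representatives = pick_representatives(example)
```

In this example `representatives["c1"]` is contig `t2`, because it has the longer protein within cluster `c1`, and `t3` is kept as the only member of `c2`.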
“…It allowed us to assign the best protein hit for transcripts (contigs) and also provide taxonomic assignations based on this best hit. Then, redundant sequences were removed with the Evidentialgene pipeline [44], and only the contigs with the best diamond hit to a sequence belonging to a metazoan species were selected. This reduced data set was labeled with ani (e.g., EveBCdTP1_ani; Fig.…”
Section: Methods (mentioning)
confidence: 99%
“…The cleaned data for each time point were assembled independently using Trinityrnaseq (v2.6.5) ( Haas et al 2013 ). Six transcriptome assemblies were merged to create the final transcripts using Evigene (v.18-01-2018) ( Gilbert 2013 ). Assembled transcripts were merged for a total of 80,804 representative genes and 108,659 alternative forms.…”
Section: Methods (mentioning)
confidence: 99%