2014
DOI: 10.1186/1471-2164-15-s8-s1

Impact of analytic provenance in genome analysis

Abstract: Background: Many computational methods are available for assembly and annotation of newly sequenced microbial genomes. However, when new genomes are reported in the literature, there is frequently very little critical analysis of choices made during the sequence assembly and gene annotation stages. These choices have a direct impact on the biologically relevant products of a genomic analysis - for instance identification of common and differentiating regions among genomes in a comparison, or identification of en…

Cited by 14 publications (5 citation statements)
References 41 publications
“…Unlike the majority of other birds analyzed so far, kiwi has a higher number of γ subgroup ORs. Gene family size estimates are highly dependent on genome quality [ 46 ] and continuous curation is ongoing even for well-annotated genomes: for example, in the chicken olfactory repertoire the number of annotated ORs changed by a factor of eight in two consecutive Ensembl releases (release 73 – 251 ORs and release 74 – 30 ORs). Further improvement of genome qualities, including kiwi, are therefore required for the identification of a complete set of ORs.…”
Section: Results (mentioning)
confidence: 99%
“…A genome sequence dataset may be recreated from raw data and the provenance records associated with genomic annotations [15].…”
Section: Next Generation Data Science Challenges (mentioning)
confidence: 99%
“…Several recent investigations have reported factors that may influence the accuracy of plastid genome assembly, including software choice (Freudenthal et al, 2020 ) and sequencing coverage (reviewed in Gruenstaeudl and Jenke, 2020 ). The choice of assembly software has been reported as a source of inconsistency in genome assembly by several previous studies (e.g., Magoc et al, 2013 ; Morrison et al, 2014 ). In the de novo assembly of plastid genomes from genome skimming data, such inconsistency may be associated with differences between assembly algorithms: while some software tools have implemented algorithms that conduct a cyclical sequence extension from a single “seed” sequence (e.g., Dierckxsens et al, 2017 ), others employ a kmer-based construction of contigs, followed by the concatenation of multiple contigs based on sequence overlap and similarity to a reference genome (e.g., Bakker et al, 2016 ; McKain and Wilson, 2017 ).…”
Section: Introduction (mentioning)
confidence: 99%