2018
DOI: 10.1371/journal.pone.0200323

Consensus assessment of the contamination level of publicly available cyanobacterial genomes

Abstract: Publicly available genomes are crucial for phylogenetic and metagenomic studies, in which contaminating sequences can be the cause of major problems. This issue is expected to be especially important for Cyanobacteria because axenic strains are notoriously difficult to obtain and keep in culture. Yet, despite their great scientific interest, no data are currently available concerning the quality of publicly available cyanobacterial genomes. As reliably detecting contaminants is a complex task, we designed a pi…

Cited by 48 publications (70 citation statements)
References 80 publications
“…A recent assessment of 67758 publically-available Salmonella sequences determined that 1.87% of samples had cross-species contamination based on a read classification approach (Robertson et al, 2018). Prevalence of cross-species sequence contamination in public repositories is a known issue that has been described in a number of studies (Merchant et al, 2014;Mukherjee et al, 2015;Lee et al, 2017;Cornet et al, 2018). Very few studies have looked at intraspecies contamination in public repositories, and we could not identify any studies evaluating prevalence of intraspecies contamination in foodborne pathogens.…”
Section: Discussion (mentioning)
confidence: 96%
“…The presence of contamination in WGS data is recognized as an important sequence quality issue (Merchant et al, 2014;Ballenghien et al, 2017;Robertson et al, 2018;Cornet et al, 2018). Introduction of contaminants can occur at many stages in the generation of bacterial sequence data.…”
Section: Introduction (mentioning)
confidence: 99%
“…Strains can be mislabelled, lose morphological or biochemical traits or harbour contaminants present at isolation (e.g for cyanobacteria with significant sheaths), with implications for experimental work - as a result, axenic strains are very difficult to obtain [53]. According to the study of Cornet et al [54] on contamination level of publicly accessible cyanobacterial genomes, 21 out of 440 surveyed genomes were highly contaminated (mostly with Proteobacteria and Bacteroidetes). In our study we used primers preferentially amplifying cyanobacterial 16S [47,48], however, some strains failed to sequence due to overlapping heterogeneous signal in Sanger sequencing, which can be indicative of contamination.…”
Section: Discussion (mentioning)
confidence: 99%
“…The absence of rRNA genes is explained by the fact that the majority of these genomes (143 out of 193) were assembled with a metagenomic pipeline, according to NCBI metadata. This phenomenon was noticed in Cornet et al (2018a) [38] and is due to the frequent loss of rRNA sequences during metagenomic assembly [28]. Given that neither the isolation source nor the assembly pipeline were easy to determine, we decided not to consider these 193 genomes in our analysis by precautionary principle.…”
Section: Limitations (mentioning)
confidence: 96%