Heiner Klingenberg scite author profile

Critical Assessment of Metagenome Interpretation—a benchmark of metagenomics software

Sczyrba¹, Hofmann², Belmann³ et al. 2017


In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.


Critical Assessment of Metagenome Interpretation – a benchmark of computational metagenomics software

Sczyrba¹, Hofmann², Belmann³ et al. 2017

Preprint


In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions. It is made available under a CC-BY 4.0 International license.


Visual attention to variation in female facial skin color distribution

Fink¹, Matts², Klingenberg³ et al. 2008

J of Cosmetic Dermatology


Protein signature-based estimation of metagenomic abundances including all domains of life and viruses

Klingenberg¹, Aßhauer², Lingner³ et al. 2013


Motivation: Metagenome analysis requires tools that can estimate the taxonomic abundances in anonymous sequence data over the whole range of biological entities. Because there is usually no prior knowledge about the data composition, not only all domains of life but also viruses have to be included in taxonomic profiling. Such a full-range approach, however, is difficult to realize owing to the limited coverage of available reference data. In particular, archaea and viruses are generally not well represented by current genome databases. Results: We introduce a novel approach to taxonomic profiling of metagenomes that is based on mixture model analysis of protein signatures. Our results on simulated and real data reveal the difficulties of the existing methods when measuring archaeal or viral abundances and show the overall good profiling performance of the protein-based mixture model. As an application example, we provide a large-scale analysis of data from the Human Microbiome Project. This demonstrates the utility of our method as a first instance profiling tool for a fast estimate of the community structure. Availability: http://gobics.de/TaxyPro. Contact: pmeinic@gwdg.de. Supplementary information: Supplementary Material is available at Bioinformatics online.


Identification of New Fungal Peroxisomal Matrix Proteins and Revision of the PTS1 Consensus

Nötzel¹, Lingner², Klingenberg³ et al. 2016

Traffic


The peroxisomal targeting signal type 1 (PTS1) is a seemingly simple peptide sequence at the C-terminal end of most peroxisomal matrix proteins. PTS1 can be described as a tripeptide with the consensus motif. However, this description is neither necessary nor sufficient. It does not cover all cases of PTS1 proteins, and some proteins in accordance with this consensus do not target to the peroxisome. In order to find new PTS proteins in yeast and to arrive at a more complete description of the PTS1 consensus motif, we developed a machine learning approach that involves orthologue expansion of the set of known peroxisomal proteins. We performed a genome-wide in silico screen, characterised several PTS1-containing peptides and identified two new peroxisomal matrix proteins, which we named Pxp1 (Yel020c) and Pxp2 (Yjr111c). Based on these in silico and in vivo analyses, we revised the yeast PTS1 consensus, which now includes all known PTS1 proteins.
