Metaproteomics has emerged as one of the most promising approaches for determining the composition and metabolic functions of complete microbial communities. Conventional metaproteomics approaches, however, rely on the construction of protein sequence databases and on efficient peptide-spectrum matching algorithms. Very large sequence databases, in turn, affect both computational effort and sensitivity. More recently, advanced de novo sequencing strategies, which annotate peptide sequences without requiring a database, have again been increasingly proposed for proteomics applications. Such approaches would vastly expand many metaproteomics applications by enabling rapid community profiling and by capturing unsequenced community members, which otherwise remain inaccessible to further interpretation. Nevertheless, because efficient pipelines and validation procedures are lacking, these strategies have only rarely been employed for community proteomics.
Here we report on a newly established de novo metaproteomics pipeline, which we evaluated for its quantitative performance using synthetic and natural communities. Additionally, we introduce a novel validation strategy and investigate the actual content of community members within community proteomics data.