Lactobacillus plantarum is a ubiquitous microorganism that is able to colonize several ecological niches, including vegetables, meat, dairy substrates and the gastrointestinal tract. An extensive phenotypic and genomic diversity analysis was conducted to elucidate the molecular basis of the high flexibility and versatility of this species. First, 185 isolates from diverse environments were phenotypically characterized by evaluating their fermentation and growth characteristics. Strains largely clustered together within their particular food niche, but human fecal isolates were scattered throughout the food clusters, suggesting that they originate from the food eaten by the individuals. Based on distinct phenotypic profiles, 24 strains were selected and, together with a further 18 strains from an earlier low-resolution study, their genomic diversity was evaluated by comparative genome hybridization against the reference genome of L. plantarum WCFS1. Over 2000 genes were identified that constitute the core genome of the L. plantarum species, including 121 unique L. plantarum marker genes that have not been found in other lactic acid bacteria. Over 50 genes were identified that are unique to the reference strain WCFS1 and absent in the other L. plantarum strains. Strains of the L. plantarum subspecies argentoratensis were found to lack a common set of 24 genes, organized in seven gene clusters/operons, supporting their classification as a separate subspecies. The results provide a detailed view of the phenotypic and genomic diversity of L. plantarum and lead to a better comprehension of niche adaptation and functionality of the organism.
Background: Linking phenotypes to high-throughput molecular biology information generated by ~omics technologies can reveal the cellular mechanisms underlying an organism's phenotype. ~Omics datasets are often very large and noisy, with many features (e.g., genes, metabolite abundances). Thus, associating phenotypes with ~omics data requires an approach that is robust to noise and can handle large and diverse data sets.
Results: We developed a web-tool, PhenoLink (http://bamics2.cmbi.ru.nl/websoftware/phenolink/), that links phenotype to ~omics data sets using well-established as well as new techniques. PhenoLink imputes missing values and preprocesses input data (i) to decrease inherent noise in the data and (ii) to counterbalance pitfalls of the Random Forest algorithm, on which feature (e.g., gene) selection is based. The preprocessed data are then used in feature selection to identify relations to phenotypes. We applied PhenoLink to identify gene-phenotype relations based on the presence/absence of 2847 genes in 42 Lactobacillus plantarum strains and phenotypic measurements of these strains in several experimental conditions, including growth on sugars and nitrogen-dioxide production. Genes were ranked based on their importance (predictive value) for correctly predicting the phenotype of a given strain. In addition to known gene-phenotype relations, we also found novel relations.
Conclusions: PhenoLink is an easily accessible web-tool that facilitates identifying relations in large and often noisy phenotype and ~omics datasets. The visualization of links to phenotypes offered in PhenoLink allows prioritizing links, finding relations between features, finding relations between phenotypes, and identifying outliers in phenotype data.
PhenoLink can be used to uncover phenotype links to a multitude of ~omics data, e.g., gene presence/absence (determined by, e.g., CGH or next-generation sequencing), gene expression (determined by, e.g., microarrays or RNA-seq), or metabolite abundance (determined by, e.g., GC-MS).
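The idea of ranking genes by how well their presence/absence predicts a phenotype can be illustrated with a minimal sketch. It is not PhenoLink's implementation: PhenoLink bases feature selection on Random Forest importance, whereas this example swaps in a simpler, deterministic stand-in, the information gain of each gene taken on its own. The mode-based imputation, the function names, and the toy data are all assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def impute_mode(column):
    """Replace None with the most common observed value (assumed strategy)."""
    observed = [v for v in column if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in column]


def information_gain(presence, phenotype):
    """Entropy reduction in the phenotype after splitting strains on one gene."""
    gain = entropy(phenotype)
    for state in set(presence):
        subset = [p for g, p in zip(presence, phenotype) if g == state]
        gain -= len(subset) / len(phenotype) * entropy(subset)
    return gain


def rank_genes(gene_matrix, phenotype):
    """Return (gene, score) pairs sorted from most to least informative."""
    scores = {
        gene: information_gain(impute_mode(column), phenotype)
        for gene, column in gene_matrix.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = gene present, 0 = absent, None = missing call.
gene_matrix = {
    "gene_A": [1, 1, 1, 0, 0, 0],     # matches the phenotype exactly
    "gene_B": [1, 0, 1, 0, 1, 0],     # only weakly related
    "gene_C": [1, 1, None, 1, 1, 1],  # present everywhere: uninformative
}
phenotype = ["grows", "grows", "grows", "no growth", "no growth", "no growth"]

ranking = rank_genes(gene_matrix, phenotype)
for gene, score in ranking:
    print(f"{gene}: {score:.3f}")
```

Unlike a Random Forest, this single-gene score cannot capture interactions between genes, and it makes no attempt to correct for noise; it only shows what ranking features by predictive value means on presence/absence data.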