Background
High-accuracy prediction tools are essential in the post-genomic era to define organellar proteomes in their full complexity. We recently applied a discriminative machine learning approach to predict plant proteins carrying peroxisome targeting signals (PTS) type 1 from genome sequences. For Arabidopsis thaliana, 392 gene models were predicted to be peroxisome-targeted. The predictions were extensively tested in vivo, resulting in a high experimental verification rate of Arabidopsis proteins previously not known to be peroxisomal.

Results
In this study, we experimentally validated the predictions in greater depth by focusing on the most challenging Arabidopsis proteins with unknown non-canonical PTS1 tripeptides and prediction scores close to the threshold. By in vivo subcellular targeting analysis, three novel PTS1 tripeptides (QRL>, SQM>, and SDL>) and two novel tripeptide residues (Q at position −3 and D at position −2) were identified. To understand why, among many Arabidopsis proteins carrying the same C-terminal tripeptides, these proteins were specifically predicted as peroxisomal, the residues upstream of the PTS1 tripeptide were computationally permuted and the changes in prediction scores were analyzed. The newly identified Arabidopsis proteins were found to contain four to five amino acid residues with high predicted targeting-enhancing properties at positions −4 to −12 in front of the non-canonical PTS1 tripeptide. The identity of the predicted targeting-enhancing residues was unexpectedly diverse, comprising, besides basic residues, also proline, hydroxylated (Ser, Thr), hydrophobic (Ala, Val), and even acidic residues.

Conclusions
Our computational and experimental analyses demonstrate that the plant PTS1 tripeptide motif is more diverse than previously thought, including an increasing number of non-canonical sequences and allowed residues.
Specific targeting-enhancing elements can be predicted for particular sequences of interest and are far more diverse in amino acid composition and positioning than previously assumed. Machine learning methods are becoming indispensable for predicting which specific proteins, among numerous candidates carrying the same non-canonical PTS1 tripeptide, contain sufficient enhancer elements in terms of number, positioning and total strength to cause peroxisome targeting.
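The in silico analysis of upstream residues described above can be illustrated with a minimal sketch. It assumes one possible reading of "computationally permuted": each position from −4 to −12 is substituted in turn with every other amino acid, and the change in prediction score is recorded. The names `upstream_scan` and `toy_score` are hypothetical, and `toy_score` is only a placeholder that counts basic residues; it is not the published prediction model and its values carry no biological meaning.

```python
from typing import Callable, Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def upstream_scan(
    sequence: str,
    score_fn: Callable[[str], float],
    first: int = 4,
    last: int = 12,
) -> Dict[int, List[Tuple[str, float]]]:
    """Substitute each residue at positions -first..-last (the C-terminal
    residue is position -1) and record the score change of every variant
    relative to the unmodified sequence."""
    if len(sequence) < last:
        raise ValueError(f"sequence must have at least {last} residues")
    baseline = score_fn(sequence)
    changes: Dict[int, List[Tuple[str, float]]] = {}
    for pos in range(first, last + 1):
        idx = len(sequence) - pos  # position -pos as a 0-based index
        per_position = []
        for aa in AMINO_ACIDS:
            if aa == sequence[idx]:
                continue  # skip the residue already present
            variant = sequence[:idx] + aa + sequence[idx + 1:]
            per_position.append((aa, score_fn(variant) - baseline))
        changes[-pos] = per_position
    return changes


def toy_score(sequence: str) -> float:
    """Placeholder scorer (assumption, NOT the published model): counts
    basic residues (K, R) at positions -12 to -4."""
    return float(sum(aa in "KR" for aa in sequence[-12:-3]))


# Hypothetical 15-residue C-terminus ending in the tripeptide SQM.
example = "AAAAAAAAAKAASQM"
scan = upstream_scan(example, toy_score)
```

With a real predictor passed as `score_fn`, positions whose substitutions mostly lower the score would be candidates for targeting-enhancing residues in that particular sequence.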
Our knowledge of the proteome of plant peroxisomes and their functional plasticity is far from complete, primarily due to major technical challenges in experimental proteome research on this fragile cell organelle. Several unexpected novel plant peroxisome functions, for instance in biotin and phylloquinone biosynthesis, have been uncovered recently. Nevertheless, very few regulatory and membrane proteins of plant peroxisomes have been identified and functionally described so far. Computational methods have emerged as powerful tools to define the matrix proteome of plant peroxisomes. Novel prediction approaches of high sensitivity and specificity have been developed for peroxisome targeting signals type 1 (PTS1) and have been validated by in vivo subcellular targeting analyses and thermodynamic binding studies with the cytosolic receptor, PEX5. Accordingly, the algorithms allow the correct prediction of many novel peroxisome-targeted proteins from plant genome sequences and the discovery of additional organelle functions. In this review, we provide an overview of the methodologies, capabilities and accuracies of available prediction algorithms for PTS1-carrying proteins. We also summarize and discuss recent quantitative, structural and mechanistic information on the interaction of PEX5 with PTS1-carrying proteins in relation to in vivo import efficiency. With this knowledge, we develop a model of how proteins likely evolved peroxisomal targeting signals in the past and continue to do so today, in which order the two import pathways might have evolved in the ancient eukaryotic cell, and how the secondary loss of the PTS2 pathway probably happened in specific organismal groups.
Most peroxisomal matrix proteins possess a C-terminal targeting signal type 1 (PTS1). Accurate prediction of functional PTS1 sequences and their relative strength by computational methods is essential for determining peroxisomal proteomes in silico, but has proved challenging because of the high sequence variability of non-canonical targeting signals, particularly in higher plants, and the low availability of experimentally validated non-canonical examples. In this study, in silico predictions were compared with in vivo targeting analyses and in vitro thermodynamic binding of mutated variants within the context of one model targeting sequence. There was broad agreement between the methods for entire PTS1 domains and position-specific single amino acid (aa) residues, including residues upstream of the PTS1 tripeptide. The hierarchy Leu>Met>Ile>Val at the C-terminal position was determined by all methods, but both experimental approaches suggest that Tyr is underweighted in the prediction algorithm because this residue is absent from the positive training dataset. A combination of methods better defines the score range that discriminates a functional PTS1. In vitro binding to the PEX5 receptor could discriminate amongst strong targeting signals, whilst in vivo targeting assays were more sensitive, allowing detection of weak functional import signals that were below the limit of detection in the binding assay. Together, the data provide a comprehensive assessment of the factors driving PTS1 efficacy and a framework for the more quantitative assessment of the protein import pathway in higher plants.
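As a small illustration of the position-specific hierarchy reported above, the sketch below extracts the C-terminal tripeptide and orders variants by their C-terminal residue according to Leu>Met>Ile>Val. The function names and example sequences are hypothetical. Placing all other residues last is purely a sorting convention assumed here; it makes no statement about their actual targeting strength (the text notes, for example, that Tyr appears to be underweighted by the prediction algorithm).

```python
from typing import List

# Hierarchy at the C-terminal position (-1) as stated in the text:
# Leu > Met > Ile > Val. Lower rank means stronger.
C_TERMINAL_RANK = {"L": 0, "M": 1, "I": 2, "V": 3}


def pts1_tripeptide(sequence: str) -> str:
    """Return the C-terminal tripeptide (positions -3 to -1) in upper case."""
    if len(sequence) < 3:
        raise ValueError("sequence must have at least three residues")
    return sequence[-3:].upper()


def order_by_c_terminal_residue(sequences: List[str]) -> List[str]:
    """Sort sequences by the residue at position -1 using the hierarchy above.

    Residues outside the hierarchy are placed last (an assumption of this
    sketch); the sort is stable, so ties keep their input order.
    """
    unranked = len(C_TERMINAL_RANK)
    return sorted(
        sequences,
        key=lambda s: C_TERMINAL_RANK.get(pts1_tripeptide(s)[-1], unranked),
    )


# Hypothetical variants of one model sequence differing only at position -1.
variants = ["AASKV", "AASKI", "AASKL", "AASKY", "AASKM"]
ordered = order_by_c_terminal_residue(variants)
```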