Mass spectrometry‐based proteomics is a popular and powerful method for precise and highly multiplexed protein identification. The most common method of analyzing untargeted proteomics data is called database searching, where the database is simply a collection of protein sequences from the target organism, derived from genome sequencing. Experimental peptide tandem mass spectra are compared to simplified models of theoretical spectra calculated from the translated genomic sequences. However, in several interesting application areas, such as forensics, archaeology, venomics, and others, a genome sequence may not be available, or the correct genome sequence to use is not known. In these cases, de novo peptide identification can play an important role. De novo methods infer peptide sequence directly from the tandem mass spectrum without reference to a sequence database, usually using graph‐based or machine learning algorithms. In this review, we provide a basic overview of de novo peptide identification methods and applications, briefly covering de novo algorithms and tools, and focusing in more depth on recent applications from venomics, metaproteomics, forensics, and characterization of antibody drugs.
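The comparison of experimental spectra with simplified theoretical spectra can be illustrated with a minimal Python sketch. It assumes singly charged b and y fragment ions only, standard monoisotopic residue masses, and a plain shared-peak count as the score. The function names (`theoretical_ions`, `shared_peak_count`, `best_match`) and the 0.02 m/z tolerance are hypothetical choices for illustration and are not taken from any specific search tool.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # mass of H2O (Da)


def theoretical_ions(peptide):
    """Return (b_ions, y_ions): singly charged fragment m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    prefix = 0.0
    for m in masses[:-1]:              # b ions: N-terminal prefixes
        prefix += m
        b_ions.append(prefix + PROTON)
    suffix = 0.0
    for m in reversed(masses[1:]):     # y ions: C-terminal suffixes
        suffix += m
        y_ions.append(suffix + WATER + PROTON)
    return b_ions, y_ions


def shared_peak_count(observed_mz, peptide, tol=0.02):
    """Count theoretical fragments with an observed peak within tol."""
    b_ions, y_ions = theoretical_ions(peptide)
    return sum(
        any(abs(obs - theo) <= tol for obs in observed_mz)
        for theo in b_ions + y_ions
    )


def best_match(observed_mz, candidates, tol=0.02):
    """Return the candidate peptide with the highest shared-peak count."""
    return max(candidates, key=lambda p: shared_peak_count(observed_mz, p, tol))
```

In practice, candidate peptides would first be filtered by precursor mass, and real tools use more elaborate scoring and statistical validation. A de novo method has no candidate list at all; it must instead infer residues from the mass differences between peaks.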
Dirigent proteins (DPs) were first discovered in Forsythia stems, but the proteins that co-purified with them remained unidentified. De novo sequencing and native mass spectrometry identified additional proteins and revealed heterocomplexes between two DP homologs.
Ricin, a protein found in castor seeds, is a lethal toxin that is designated as a category 2 select agent, and cases of attempted ricin poisoning are relatively common. Many methods to detect protein toxins such as ricin use targeted liquid chromatography–tandem mass spectrometry (LC–MS/MS) to identify toxin peptides, usually tryptic peptides. The successful use of untargeted methods has also been reported. However, the use of untargeted proteomics methods, including database search, for peptide and protein identification is less common in forensic practice and may be unfamiliar to forensic science practitioners. Here, we propose a method to create spectral libraries of tryptic ricin peptides and use these libraries for ricin identification by spectral library search, which may be more familiar to forensic scientists because of the use of spectral libraries in small molecule identification. Peptide spectral libraries offer a direct comparison to an authentic standard, a key element of forensic analysis, but have not previously been used in a forensic context. To construct these spectral libraries, two pure ricin samples (one from a proposed standard reference material) were digested with trypsin and analyzed using a standard shotgun LC–MS/MS protocol. Spectral libraries were created from resulting tryptic peptides identified from filtered search results from four database search tools. The library was then used in a search using SpectraST on forensically realistic castor seed extracts. These castor seed samples were made using the crude methods commonly encountered in real-world ricin cases. Analysis showed that the spectral library search resulted in more peptides identified from crude castor seed samples compared to MS-GF+ and Sequest plus Percolator database searches. 
These results, the first published use of spectral library search to detect protein toxins in forensically relevant samples, suggest that computational comparison of putative ricin peptide spectra to library spectra can be an effective method to detect ricin in an unknown sample. Data are available via ProteomeXchange with identifier PXD013711.
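The idea of comparing a query spectrum with library spectra can be sketched as follows. This is a generic binned cosine similarity, given only as an assumption-laden illustration; it is not SpectraST's actual scoring function. The function names, the 1.0 m/z bin width, and the dictionary layout of the library are all hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        binned[idx] = binned.get(idx, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned spectra (0.0 to 1.0)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(idx, 0.0) for idx, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_library(query_peaks, library, bin_width=1.0):
    """Return (entry_name, score) for the best-scoring library spectrum.

    `library` maps a hypothetical entry name to a list of (mz, intensity).
    """
    scored = (
        (name, cosine_similarity(query_peaks, peaks, bin_width))
        for name, peaks in library.items()
    )
    return max(scored, key=lambda item: item[1])
```

A real library search would also restrict candidates by precursor m/z and estimate a false discovery rate, which this sketch omits.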
The discovery of dirigent proteins (DPs) and their functions in plant phenol biochemistry was made over two decades ago with Forsythia × intermedia. Stereo-selective, DP-guided, monolignol-derived radical coupling in vitro was then reported to afford the optically active lignan (+)-pinoresinol from coniferyl alcohol, provided one-electron oxidase/oxidant capacity was present. It later became evident that DPs have several distinct sub-families. In vascular plants, DPs hypothetically function, along with other essential enzymes/proteins (e.g. oxidases), as part of lignin/lignan forming complexes (LFCs). Herein, we used an integrated bottom-up, top-down, and native mass spectrometry approach to detect potential interacting proteins in a DP-enriched solubilized protein fraction from Forsythia × intermedia, via adaptation of our initial report of DP solubilization and purification. Because this hybrid species lacks a published genome, de novo sequencing was performed using publicly available transcriptome and genomic data from closely related species. We detected and identified two new DP homologs, which appear to form hetero-trimers. Molecular dynamics simulations suggest that similar hetero-trimers were possible between Arabidopsis DP homologs with comparable sequence similarity. Other identified proteins in the DP-enriched preparation were putatively associated with DP function or the cell wall. Although their co-occurrence after extraction and chromatographic separation is suggestive of components of a protein complex in vivo, none were found to form stable complexes with DPs under the specific experimental conditions we have explored. Nevertheless, our integrated mass spectrometry method development helps prepare for future investigations aimed at detecting hypothetical LFCs and other related complexes isolated from plant biomass fractionation.