Zhen-Lin Chen scite author profile

A high-speed search engine pLink 2 with systematic evaluation for proteome-scale identification of cross-linked peptides

We describe pLink 2, a search engine with higher speed and reliability for proteome-scale identification of cross-linked peptides. With a two-stage open search strategy facilitated by fragment indexing, pLink 2 is ~40 times faster than pLink 1 and 3 to 10 times faster than Kojak. Furthermore, using simulated datasets, synthetic datasets, ¹⁵N metabolically labeled datasets, and entrapment databases, four analysis methods were designed to evaluate the credibility of ten state-of-the-art search engines. This systematic evaluation shows that pLink 2 outperforms the other search engines in precision and sensitivity, especially at proteome scales. Lastly, re-analysis of four published proteome-scale cross-linking datasets with pLink 2 required only a fraction of the time used by pLink 1, with up to 27% more cross-linked residue pairs identified. pLink 2 is therefore an efficient and reliable tool for cross-linking mass spectrometry analysis, and the systematic evaluation methods described here will be useful for future software development.
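As a rough illustration of what fragment indexing means in general, the sketch below builds an inverted index from binned fragment m/z values to peptide IDs and uses it to shortlist candidates for a spectrum. It is not pLink 2's implementation: the function names, the 0.02 Da bin width, the restriction to singly charged b and y ions of linear peptides, and the toy peptide list are all assumptions made for this example.

```python
from collections import defaultdict

# Monoisotopic residue masses in Da for a small subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
PROTON = 1.00728
WATER = 18.01056
BIN_WIDTH = 0.02  # Da; hypothetical bin width chosen for this sketch


def fragment_masses(peptide):
    """Return singly charged b- and y-ion m/z values of a linear peptide."""
    total = sum(RESIDUE_MASS[aa] for aa in peptide)
    prefix = 0.0
    masses = []
    for aa in peptide[:-1]:
        prefix += RESIDUE_MASS[aa]
        masses.append(prefix + PROTON)                  # b ion
        masses.append(total - prefix + WATER + PROTON)  # y ion
    return masses


def build_index(peptides):
    """Map each fragment mass bin to the set of peptide IDs producing it."""
    index = defaultdict(set)
    for pid, peptide in enumerate(peptides):
        for mz in fragment_masses(peptide):
            index[round(mz / BIN_WIDTH)].add(pid)
    return index


def rank_candidates(index, peaks, top_n=2):
    """Count matched peaks per peptide and return the best candidates."""
    counts = defaultdict(int)
    for mz in peaks:
        center = round(mz / BIN_WIDTH)
        hits = set()
        for b in (center - 1, center, center + 1):  # tolerate bin edges
            hits |= index.get(b, set())
        for pid in hits:
            counts[pid] += 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


peptides = ["GASK", "VLKR", "SEAR"]  # toy database
index = build_index(peptides)
spectrum = fragment_masses("VLKR")   # idealized, noise-free peaks
print(rank_candidates(index, spectrum))
```

The point of the index is that each peak is looked up once, so only peptides sharing fragments with the spectrum are ever touched. A full search engine would then rescore the shortlisted candidates with a finer scoring function.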

Mapping disulfide bonds from sub-micrograms of purified proteins or micrograms of complex protein mixtures

Cao, Fan, et al. 2018. Biophys Rep

Disulfide bonds are vital for protein functions, but locating the linkage sites has been a challenge in protein chemistry, especially when the quantity of a sample is small or the complexity is high. In 2015, our laboratory developed a sensitive and efficient method for mapping protein disulfide bonds from simple or complex samples (Lu et al. in Nat Methods 12:329, 2015). This method is based on liquid chromatography–mass spectrometry (LC–MS) and a powerful data analysis software tool named pLink. To facilitate application of this method, we present step-by-step disulfide mapping protocols for three types of samples—purified proteins in solution, proteins in SDS-PAGE gels, and complex protein mixtures in solution. The minimum amount of protein required for this method can be as low as several hundred nanograms for purified proteins, or tens of micrograms for a mixture of hundreds of proteins. The entire workflow—from sample preparation to LC–MS and data analysis—is described in great detail. We believe that this protocol can be easily implemented in any laboratory with access to a fast-scanning, high-resolution, and accurate-mass LC–MS system.

How to use open-pFind in deep proteomics data analysis?— A protocol for rigorous identification and quantitation of peptides and proteins from mass spectrometry data

Shao, Cao, Chen, et al. 2021

High-throughput proteomics based on mass spectrometry (MS) analysis has permeated biomedical science and propelled numerous research projects. pFind 3 is a database search engine for high-speed and in-depth proteomics data analysis. pFind 3 features a swift open search workflow that is adept at uncovering less obvious information, such as unexpected modifications or mutations, that would go unnoticed in a conventional data analysis pipeline. In this protocol, we provide step-by-step instructions to help users master various types of data analysis using pFind 3 in conjunction with pParse for data pre-processing and, if needed, pQuant for quantitation. This streamlined pParse-pFind-pQuant workflow offers exceptional sensitivity, precision, and speed. It can be easily implemented in any laboratory that needs to identify peptides, proteins, or post-translational modifications, or to perform quantitation based on ¹⁵N labeling, SILAC labeling, or TMT/iTRAQ labeling.

pDeepXL: MS/MS Spectrum Prediction for Cross-Linked Peptide Pairs by Deep Learning

et al. 2021

In cross-linking mass spectrometry, the identification of cross-linked peptide pairs relies heavily on the ability of a database search engine to measure the similarity between experimental and theoretical MS/MS spectra. However, the lack of accurate ion intensities in theoretical spectra impairs the performance of search engines, particularly at proteome scale. Here we introduce pDeepXL, a deep neural network to predict MS/MS spectra of cross-linked peptide pairs. To train pDeepXL, we used transfer learning because it made training possible with the limited benchmark data available for cross-linked peptide pairs. Test results on more than ten data sets showed that pDeepXL accurately predicted the spectra of both noncleavable DSS/BS3/Leiker cross-linked peptide pairs (>80% of predicted spectra have Pearson’s r values higher than 0.9) and cleavable DSSO/DSBU cross-linked peptide pairs (>75% of predicted spectra have Pearson’s r values higher than 0.9). pDeepXL also achieved accurate prediction on unseen data sets using an online fine-tuning technique. Lastly, integrating pDeepXL into a database search engine increased the number of identified cross-link spectra by 18% on average.
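The accuracy figures quoted above are fractions of spectra whose predicted and experimental intensities have a Pearson's r above 0.9. A minimal sketch of that kind of summary follows. It assumes the predicted and experimental intensities are already aligned ion by ion; the function names and the toy intensity vectors are hypothetical and are not taken from pDeepXL.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: correlation undefined, treated as 0 here
    return sxy / math.sqrt(sxx * syy)


def fraction_above(pairs, threshold=0.9):
    """Fraction of (predicted, experimental) pairs with r above threshold."""
    rs = [pearson_r(pred, exp) for pred, exp in pairs]
    return sum(r > threshold for r in rs) / len(rs)


# Toy data: one well-predicted spectrum and one poorly predicted spectrum.
pairs = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]),
]
print(fraction_above(pairs))
```

Treating a constant intensity vector as r = 0 is a choice made for this sketch; an evaluation pipeline might instead exclude such spectra.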

Comparative Analysis of Chemical Cross-Linking Mass Spectrometry Data Indicates That Protein STY Residues Rarely React with N-Hydroxysuccinimide Ester Cross-Linkers

et al. 2023

In mass spectrometry data analysis for identifying peptide pairs linked by N-hydroxysuccinimide (NHS) ester cross-linkers, search engines differ in how they set cross-linkable sites. Some restrict NHS ester cross-linkable sites to lysine (K) and the protein N-terminus, referred to as K only for short, whereas others additionally include serine (S), threonine (T), and tyrosine (Y) by default. Here, by setting amino acids with chemically inert side chains such as glycine (G), valine (V), and leucine (L) as cross-linkable sites, which serves as a negative control, we show that software-identified STY-cross-links are only as reliable as GVL-cross-links. This is true across different NHS ester cross-linkers including DSS, DSSO, and DSBU, and across different search engines including MeroX, xiSearch, and pLink. Using a published data set derived from synthetic peptides, we demonstrate that STY-cross-links indeed have a high false discovery rate. Further analysis revealed that, depending on the data and the search engine used to analyze it, up to 65% of the STY-cross-links identified are actually K−K cross-links of the same peptide pairs, up to 61% are actually K-mono-links, and the rest tend to contain short peptides at high risk of false identification.

A high-speed search engine pLink 2 with systematic evaluation for proteome-scale identification of cross-linked peptides
