Statistically accurate protein identification is a cornerstone of proteomics and underpins the understanding and application of this technology across medicine and biology. As a branch of biochemistry, proteomics has in recent years played a pivotal role in advancing the accurate characterisation of the biology and interactions of groups of proteins, or proteomes. Proteomics has primarily used mass spectrometry (MS)-based techniques to identify proteins, although other techniques, including affinity-based identification, still play significant roles. Here, we outline the basics of MS to explain how data are generated and which parameters inform the computational tools used in protein identification. We then provide a comprehensive analysis of the bioinformatics and computational methodologies used for protein identification in proteomics, including the most current community-accepted metrics for validating an identification.
Post-translational modifications (PTMs) can occur soon after translation or at any stage in the lifecycle of a given protein, and they may help regulate protein folding, stability, cellular localisation, activity, or the interactions proteins have with other proteins or biomolecular species. PTMs are crucial to our functional understanding of biology, and new quantitative mass spectrometry (MS) and bioinformatics workflows are maturing in both labelled multiplexed and label-free techniques, offering increasing coverage and new opportunities to study human health and disease. Techniques such as Data Independent Acquisition (DIA) are emerging as promising approaches because the acquired data can be re-mined. Many bioinformatics tools have been developed to support the analysis of PTMs by mass spectrometry, from PTM prediction and site assignment, through open searches that enable better mining of unassigned mass spectra (many of which likely harbour PTMs), to understanding PTM associations and interactions. The remaining challenge lies in extracting functional information from clinically relevant PTM studies. This review canvasses the options and progress of PTM analysis for large quantitative studies, from choosing the platform through to data analysis, with an emphasis on clinically relevant samples such as plasma and other body fluids, and on well-established tools and options for data interpretation.
Histone deacetylases (HDACs) are metal-dependent enzymes and are considered important targets for cell functioning. In particular, elevated expression of class I HDACs is common at the onset of multiple malignancies and results in the deregulation of many target genes involved in cell growth, differentiation and survival. Although substantial attempts have been made to control the irregular functioning of HDACs by employing various inhibitors with high sensitivity towards transformed cells, limited success has been achieved in epigenetic cancer therapy. In this study, we used ligand-based pharmacophore and 2-dimensional quantitative structure-activity relationship (QSAR) modeling approaches to target class I HDAC isoforms. Pharmacophore models were generated by taking into account the known IC50 values and experimental energy scores, with extensive validation. The QSAR model, which had an external R² value of 0.93, was employed for virtual screening of compound libraries. Ten potential lead compounds (C1-C10) with strong binding affinities for HDACs were shortlisted, of which two (C8 and C9) were able to interact with all members of class I HDACs. The potential binding modes of HDAC2 and HDAC8 to C8 were explored through molecular dynamics simulations. Overall, bioactivity and ligand efficiency (binding energy/non-hydrogen atoms) profiles suggested that the proposed hits may be more effective inhibitors for cancer therapy.
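The ligand efficiency ratio mentioned above (binding energy divided by the number of non-hydrogen atoms) can be illustrated with a minimal sketch. The function name, the kcal/mol unit, and the numeric inputs are hypothetical and are not taken from the study; the sign flip, which makes stronger binders score higher, is one common convention and is assumed here rather than confirmed by the abstract.

```python
def ligand_efficiency(binding_energy_kcal_mol: float, heavy_atom_count: int) -> float:
    """Ligand efficiency as binding energy per non-hydrogen (heavy) atom.

    Assumption: binding energies are negative for favourable binding, so the
    sign is flipped to give a positive score for a favourable binder.
    """
    if heavy_atom_count <= 0:
        raise ValueError("heavy_atom_count must be a positive integer")
    return -binding_energy_kcal_mol / heavy_atom_count


# Hypothetical compound: -9.0 kcal/mol binding energy, 30 heavy atoms.
example_le = ligand_efficiency(-9.0, 30)
print(f"ligand efficiency: {example_le:.2f} kcal/mol per heavy atom")
```

Normalising by heavy-atom count allows compounds of different sizes to be compared, since larger molecules tend to reach stronger raw binding energies simply by making more contacts.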
Supplementary data is available at Bioinformatics online. Sample files are available at https://github.com/znoor/iSwathX and https://biolinfo.shinyapps.io/iSwathX.
Data-independent-acquisition mass spectrometry (DIA-MS) is a state-of-the-art proteomic technique for high-throughput identification and quantification of peptides and proteins. Interpretation of DIA-MS data relies on the use of a spectral library, which is optimally created from data acquired from the same samples in data-dependent acquisition (DDA) mode. As DIA-MS quantification relies on the spectral libraries, having a high-quality, non-redundant, and comprehensive spectral library is essential. This article describes the major steps for creating a high-quality spectral library using a combination of multiple complementary search engines. We discuss appropriate strategies to control the false discovery rate for the final spectral library as a result of merging multiple searches.
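As a rough illustration of false discovery rate control, the sketch below applies the generic target-decoy approach to a pooled list of peptide-spectrum matches (PSMs). It is not the article's specific merging strategy, which the abstract does not detail. The function names and the example scores are hypothetical, and the sketch assumes that scores from the different search engines have already been placed on a common scale where higher is better.

```python
from typing import List, Tuple


def q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Return (score, is_decoy, q_value) tuples sorted by descending score.

    Each input PSM is (score, is_decoy). The FDR at a score threshold is
    estimated as (#decoys) / (#targets) among PSMs at or above that score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR achievable at this or any more permissive threshold.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()
    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]


def filter_targets(psms: List[Tuple[float, bool]], max_q: float = 0.01) -> List[Tuple[float, float]]:
    """Keep target PSMs whose q-value is at or below max_q."""
    return [(s, q) for s, d, q in q_values(psms) if not d and q <= max_q]


# Hypothetical pooled PSMs: (score, is_decoy)
pooled = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
print(filter_targets(pooled, max_q=0.01))
```

Re-estimating the error rate on the pooled results, rather than reusing each engine's own threshold, matters because combining several searches that were each filtered separately can push the combined error rate above the nominal level.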