The massive amount of data generated by genome sequencing brings a large number of newly identified mutations, whose pathogenic or non-pathogenic effects need to be evaluated. This has given rise to several mutation predictor tools that, in general, do not consider the specificities of the various protein groups. We aimed to develop a predictor tool dedicated to membrane proteins, under the premise that their specific structural features and environment would give different responses to mutations compared to globular proteins. For this purpose, we created TMSNP, a database that currently contains information from 2624 pathogenic and 196 705 non-pathogenic reported mutations located in the transmembrane region of membrane proteins. By computing various conservation parameters on these mutations in combination with annotations, we trained a machine-learning model able to classify mutations as pathogenic or not. TMSNP (freely available at http://lmc.uab.es/tmsnp/) considerably improves on the prediction power of commonly used mutation predictors trained with globular proteins.
The massive amount of data generated from genome sequencing has given rise to several mutation predictor tools, although no mutation database or predictor tool has been developed specifically for the transmembrane (TM) region of membrane proteins. We present TMSNP, a database that currently contains information from 2624 pathogenic and 195964 non-pathogenic reported mutations located in the TM region of membrane proteins. The conservation parameters and annotations computed for these mutations were used to train a machine-learning model that classifies TM mutations as pathogenic or non-pathogenic. The presented tool considerably improves on the prediction power of commonly used mutation predictors and additionally represents the first mutation prediction tool specific for TM mutations. TMSNP is available at