Short title: Comparative analysis of protein-protein interaction databases.
Abstract: Protein-protein interactions (PPIs) are critical, and so are the databases and tools (resources) concerning PPIs. In the absence of systematic comparisons, however, biologists and bioinformaticians may be forced to choose subjectively among such protein interaction databases and tools. In fact, a comprehensive list of such bioinformatics resources has not been reported so far. For the first time, we compiled 375 PPI resources, short-listed and performed a preliminary comparison of 125 important ones (both lists available publicly at startbioinfo.com), and then systematically compared human PPIs from 16 carefully selected databases. General features were first compared in detail. The coverage of 'experimentally verified' vs. all PPIs, as well as of those significant for disease-associated and other types of genes, was then compared quantitatively among the chosen databases. This was done in two ways: using outputs obtained manually through the web interfaces, and using all interactions downloaded from the databases. For the first approach, PPIs obtained in response to gene queries using the web interfaces were compared. As a query set, 108 genes associated with different tissues (specific to kidney, testis, and uterus, and ubiquitous) or diseases (breast cancer, lung cancer, Alzheimer's, cystic fibrosis, diabetes, and cardiomyopathy) were chosen. PPI coverage for well-studied genes was also compared with that of less-studied ones. For the second approach, the back-end data from the databases were downloaded and compared. Based on the results, we recommend the use of STRING and UniHI for retrieving the majority of 'experimentally verified' protein interactions, and hPRINT and STRING for obtaining the maximum number of 'total' (experimentally verified as well as predicted) PPIs. The analysis of experimentally verified PPIs found exclusively in each database revealed that STRING contributed about 71% of exclusive hits.
Overall, hPRINT, STRING and IID together retrieved ~94% of 'total' protein interactions available in the databases. The coverage of certain databases was skewed for some gene-types. The results also indicate that the database usage frequency may not correlate with their advantages, thereby justifying the need for more frequent studies of this nature.
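The two quantities reported above, the interactions found exclusively in one database and the share of all interactions retrieved by a chosen subset of databases, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the database names (`DB_A`, `DB_B`, `DB_C`), the protein identifiers (`P1` to `P10`), and the helper functions are hypothetical, and the toy data are invented purely to show the set arithmetic. The sketch assumes interactions are undirected, so that A-B and B-A count as the same PPI.

```python
def normalize(pair):
    """Treat A-B and B-A as the same undirected interaction (assumption)."""
    a, b = pair
    return frozenset((a.upper(), b.upper()))


def exclusive_hits(db_to_ppis):
    """For each database, return the interactions found in no other database."""
    exclusive = {}
    for name, ppis in db_to_ppis.items():
        others = set().union(*(p for n, p in db_to_ppis.items() if n != name))
        exclusive[name] = ppis - others
    return exclusive


def combined_coverage(db_to_ppis, chosen):
    """Fraction of all distinct interactions retrieved by the chosen databases."""
    union_all = set().union(*db_to_ppis.values())
    union_chosen = set().union(*(db_to_ppis[n] for n in chosen))
    return len(union_chosen) / len(union_all)


# Toy data (invented for illustration, not taken from any real database).
raw = {
    "DB_A": [("P1", "P2"), ("P3", "P4"), ("P5", "P6")],
    "DB_B": [("P2", "P1"), ("P3", "P4"), ("P7", "P8")],
    "DB_C": [("P1", "P2"), ("P9", "P10")],
}
dbs = {name: {normalize(p) for p in pairs} for name, pairs in raw.items()}

exclusive = exclusive_hits(dbs)
total_exclusive = sum(len(hits) for hits in exclusive.values())

for name, hits in exclusive.items():
    share = len(hits) / total_exclusive
    print(f"{name}: {len(hits)} exclusive PPI(s), {share:.0%} of exclusive hits")

print(f"DB_A + DB_B coverage: {combined_coverage(dbs, ['DB_A', 'DB_B']):.0%}")
```

With real downloads, the main extra work would be mapping each database's protein identifiers to a common namespace before normalizing the pairs, since the same interaction is otherwise counted as distinct.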
Studying the molecular basis of Non-Obstructive Azoospermia (NOA), a type of male infertility with failed spermatogenesis at various stages, can also help in exploring the molecular basis of human spermatogenesis and possibly pave the way to identifying new targets for male contraceptive development. Hence, we initiated a functional genomics study by applying RNA-seq to testicular biopsies collected from donors with Non-Obstructive Azoospermia (NOA), Obstructive Azoospermia (OA), Congenital Bilateral Absence of the Vas Deferens (CBAVD), and Varicocele (VA). A strong association of 100+ genes with human spermatogenesis and NOA was detected via NGS-based transcriptomic analysis. In addition, 20 RNA molecules have been short-listed for potential diagnostic applications (non-obstructive azoospermia vs. obstructive azoospermia, varicocele or normal). A hierarchical list of several genes and alternatively spliced mRNAs transcribed differentially in NOA is reported, ranked by 'strength of association'. For many of these genes, the association with NOA, spermatogenesis or both is a new finding, as revealed by comparison with a newly prepared comprehensive list of genes known to be associated with human spermatogenesis/NOA. Many top-ranking genes involved in viral gene expression were up-regulated in testes from NOA patients, while those associated with an antiviral mechanism were down-regulated. A tangential finding: while most well-established control mRNAs did not qualify, two new ones worked best in RT-qPCR experiments. Needle aspiration of testicular biopsies, followed by the use of the short-listed candidate biomarkers (i.e., 16 mRNAs and 4 chimeric transcripts) and control mRNAs in RT-qPCR-based diagnostic assays, may help to avoid open surgeries in the future.