SummaryEmpirical clinical studies on the human interactome and phenome not only illustrates prevalent phenotypic overlap and genetic overlap between diseases, but also reveals a modular organization of the genetic landscape of human disease, provding new opportunities to reduce the complexity in dissecting the phenotype-genotype association. We here introduce a network-module based method towards phenotype-genotype association inference and disease gene identification. This approach incorporates protein-protein interaction network, phenotype similarity network and known phenotype-genotype associations into an assembled network. We then decomposes the resulted network into modules (or communities) wherein we identified and prioritized the disease genes from the candidates within the loci associated with the query disease using a linear regression model and concordance score. For the known phenotype-gene associations in the OMIM database, we used the leave-one-out validation to evaluate the feasibility of our method, and successfully ranked known disease genes at top 1 in 887 out of 1807 cases. Moreover, applying this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.
Inspired by recent discovery that human disease phenome shows a modular organization on the genetic landscape, we introduce a network-module based method towards phenotype-genotype association inference and disease gene identification. This approach integrates protein-protein interaction network, phenotype similarity network and known phenotype-genotype associations, and then decomposes the resulted assembled network into modules (or communities) wherein we identified and prioritized the disease genes from the candidates within the loci associated with the query disease using a linear regression model and concordance score. For the known phenotype-gene associations in the OMIM database, we used the leave-one-out validation to evaluate the feasibility of our method, and successfully ranked known disease genes at top 1 in 887 out of 1807 cases. Moreover, applying this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.