High-throughput assays allow researchers to identify sets of genes related to experimental conditions or phenotypes of interest. These gene sets are frequently subjected to functional interpretation using databases of gene annotations. Recent methods have extended this approach to also consider networks of gene-gene relationships and interactions when characterizing properties of a gene set. We present here a supervised learning algorithm for gene set analysis, called 'GeneSet MAPR', that for the first time explicitly considers the patterns of direct as well as indirect relationships present in the network to quantify gene-gene similarities and then report shared properties of the gene set. Our extensive evaluations show that GeneSet MAPR performs better than other network-based methods at identifying genes related to a given gene set, enabling more reliable functional characterization of the gene set. When applied to the set of response-associated genes from a triple-negative breast cancer study, GeneSet MAPR uncovers gene families, such as claudins, kallikreins, and collagen type alpha chains, that are related to patients' response to treatment and are not uncovered by traditional analysis.
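As a rough illustration of what "direct as well as indirect relationships" could mean in practice, the sketch below counts one-step and two-step connections from every gene to the members of a gene set in a toy network. It is not the GeneSet MAPR algorithm: the function name `path_count_features`, the adjacency-matrix representation, and the five-gene network are all assumptions made for this example, and the supervised step that would weight such features is omitted.

```python
def path_count_features(adj, gene_set):
    """Return (direct, indirect) connection counts from each gene to gene_set.

    adj is a square 0/1 adjacency matrix (list of lists); gene_set is a
    collection of gene indices. Hypothetical helper for illustration only.
    """
    n = len(adj)
    members = [1 if g in gene_set else 0 for g in range(n)]
    # Direct: number of edges linking gene i to a member of the set.
    direct = [sum(adj[i][j] * members[j] for j in range(n)) for i in range(n)]
    # Indirect: number of two-step walks from gene i that end in the set.
    # Walks that step out and return to the starting gene are included.
    indirect = [sum(adj[i][k] * direct[k] for k in range(n)) for i in range(n)]
    return list(zip(direct, indirect))


# Toy undirected chain of five genes: 0 - 1 - 2 - 3 - 4 (illustrative data).
adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
features = path_count_features(adj, gene_set={0, 1})
for gene, (direct, indirect) in enumerate(features):
    print(f"gene {gene}: direct={direct}, indirect={indirect}")
```

In this toy chain, gene 3 has no direct edge to the set but still reaches it through gene 2, which is the kind of signal a method restricted to direct neighbors would miss. A supervised model could then learn how much weight each kind of connection deserves when ranking candidate genes.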