2019
DOI: 10.1101/651042
Preprint

Metric Learning on Expression Data for Gene Function Prediction

Abstract: Motivation: Co-expression of two genes across different conditions is indicative of their involvement in the same biological process. However, using RNA-Seq datasets with many experimental conditions from diverse sources introduces batch effects and other artefacts that might obscure the real co-expression signal. Moreover, only a subset of experimental conditions is expected to be relevant for finding genes related to a particular Gene Ontology (GO) term. Therefore, we hypothesize that when the purpose is to …

Cited by 7 publications (14 citation statements)
References 26 publications
“…Several computational approaches to predicting protein functions have been developed (1). These approaches rely on different types of information that can be used to predict protein functions, including the protein sequence (2, 3), interaction networks (4), gene expression (5), sequence similarity (6), phenotypes resulting from loss of function mutations (7) and text mining (8). The different types of information can often provide complementary information and, consequently, combining multiple types of information can often improve predictive performance (1).…”
Section: Introduction (mentioning)
confidence: 99%
“…Representing gene coexpression as networks eases the study and visualisation of the expression data (Weirauch, 2011; Magwene and Kim, 2004). One motivation behind creating these networks is that genes which are coexpressed across multiple samples are likely to have related functions (Hughes et al., 2000; Stuart et al., 2003; van Noort et al., 2003; Makrodimitris et al., 2020), allowing inference of gene function using guilt-by-association approaches (Wolfe et al., 2005). This procedure is especially useful if the studied organism is poorly annotated.…”
Section: Introduction (mentioning)
confidence: 99%
“…[11–18] These protein-surface systems include either small extracellular matrix (ECM) proteins or ECM protein domains that are adsorbed onto hydroxyapatite (HAP) crystals [14, 19–29]. These studies suggest that protein secondary structure can change when adsorbed to surfaces. They also hypothesize that the interaction of amino acid side-chains with inorganic surfaces are dependent on the secondary structures of adsorbed proteins.…”
Section: Introduction (mentioning)
confidence: 99%
“…They also hypothesize that the interaction of amino acid side-chains with inorganic surfaces are dependent on the secondary structures of adsorbed proteins [20, 22, 24, 25, 27, 30]. Thus, all components of protein-surface interaction can be determined only when the structure of the adsorbed protein is resolved.…”
Section: Introduction (mentioning)
confidence: 99%