2013
DOI: 10.2174/09298665113209990063

Biomedical Hypothesis Generation by Text Mining and Gene Prioritization

Abstract: Text mining methods can facilitate the generation of biomedical hypotheses by suggesting novel associations between diseases and genes. Previously, we developed a rare-term model called RaJoLink (Petric et al., J. Biomed. Inform. 42(2): 219-227, 2009) in which hypotheses are formulated on the basis of terms rarely associated with a target domain. Since many current medical hypotheses are formulated in terms of molecular entities and molecular mechanisms, here we extend the methodology to proteins and genes, usi…

citations
Cited by 9 publications
(8 citation statements)
references
References 101 publications
(70 reference statements)
“…Since most of the LBD research performed in medicine, the most common term representation is using UMLS and MeSH (Lever et al, 2017; Preiss & Stevenson, 2017). Apart from these two medical resources, other medical databases such as Entrez Gene, HUGO (Petric et al, 2014), LocusLink (Hristovski et al, 2005), OMIM (Hristovski et al, 2003) and PharmGKB (Kim & Park, 2016) have also being used extract data units. LBD studies in other domains mainly consider word or word phrases (n-grams) as their term representation (Qi & Ohsawa, 2016) that have been extracted using techniques such as Part-Of-Speech (POS) tag patterns.…”
Section: What Are the Data Types Considered For Knowledge Discovery?
mentioning
confidence: 99%
“…Network/Graph-based Measures: Network/graph-based measures analyse node and edge-level attributes to score the associations. Examples of measures that represent this category include Degree centrality (Goodwin, Cohen & Rindflesch, 2012), Eigenvector centrality (Özgür et al, 2010), Closeness centrality (Özgür et al, 2011), Betweenness centrality (Özgür et al, 2010), Common Neighbours (Kastrin, Rindflesch & Hristovski, 2014), Jaccard Index (Kastrin, Rindflesch & Hristovski, 2014), Preferential Attachment (Kastrin, Rindflesch & Hristovski, 2014), Personalised PageRank (Petric et al, 2014), Personalised Diffusion Ranking (Petric et al, 2014), and Spreading Activation (Goodwin, Cohen & Rindflesch, 2012). Knowledge-based Measures: This category denotes the scoring measures such as MeSH-based Literature cohesiveness (Swanson, Smalheiser & Torvik, 2006), semantic type co-occurrence (Jha & Jin, 2016b), chemDB atomic count (Ijaz, Song & Lee, 2010), and chemDB XLogP (Ijaz, Song & Lee, 2010) that involve the knowledge from structured resources to rank the associations.…”
Section: What Are the Ranking/thresholding Mechanisms Used In Lbd Lit…
mentioning
confidence: 99%
“…To explore whether APAP and pesticide exposure might influence ASD risk through similar underlying biological mechanisms, we conducted this study to perform database mining of the genetic associations of ASD in combination with network and functional analysis of candidate genes. Computer-based text mining methods have been used before to generate biomedical hypotheses on the impact of multiple exposures by examining novel associations between genes and diseases [34]. For example, literature mining and computational systems biology methods were used to explore the possible etiologic links between environmental chemicals, genes of interest, and type II diabetes [35].…”
Section: Introduction
mentioning
confidence: 99%
“…Network/Graph-based Measures: Network/graph-based measures analyse node and edge-level attributes to score the associations. Examples of measures that represent this category include Degree centrality (Goodwin et al, 2012), Eigenvector centrality (Özgür et al, 2010), Closeness centrality (Özgür et al, 2011), Betweenness centrality (Özgür et al, 2010), Common Neighbours (Kastrin et al, 2014), Jaccard Index (Kastrin et al, 2014), Preferential Attachment (Kastrin et al, 2014), Personalised PageRank (Petric et al, 2014), Personalised Diffusion Ranking (Petric et al, 2014), and Spreading Activation (Goodwin et al, 2012). Knowledge-based Measures: This category denotes the scoring measures such as MeSH-based Literature cohesiveness (Swanson et al, 2006), semantic type co-occurrence (Jha and Jin, 2016b), chemDB atomic count (Ijaz et al, 2010), and chemDB XLogP (Ijaz et al, 2010) that involve the knowledge from structured resources to rank the associations. The advantage of these measures is that they entangle semantic aspects into consideration to decide the potentiality of the association. Relations-based Measures: Relations/predicate based measures (a sub-class of knowledge-based measures) analyse the relations extracted from resources such as SemRep to rank/threshold associations.…”
mentioning
confidence: 99%
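
Three of the network/graph-based measures named in the citation statements above (Common Neighbours, Jaccard Index, Preferential Attachment) can be illustrated on a toy co-occurrence graph. The sketch below is only an illustration under stated assumptions: the node names, the edge list, and all function names are hypothetical, and it is not the implementation used in any of the cited papers.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected graph as a node -> neighbour-set mapping."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def common_neighbours(graph: Graph, u: str, v: str) -> int:
    """Number of nodes adjacent to both u and v."""
    return len(graph[u] & graph[v])


def jaccard_index(graph: Graph, u: str, v: str) -> float:
    """Shared neighbours divided by the union of both neighbourhoods."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0


def preferential_attachment(graph: Graph, u: str, v: str) -> int:
    """Product of the degrees of u and v."""
    return len(graph[u]) * len(graph[v])


# Hypothetical co-occurrence edges: a disease and a gene that are not
# directly linked but share intermediate terms.
edges = [
    ("disease", "termA"),
    ("disease", "termB"),
    ("gene", "termA"),
    ("gene", "termB"),
    ("gene", "termC"),
]
graph = build_graph(edges)

for name, measure in [
    ("common neighbours", common_neighbours),
    ("jaccard index", jaccard_index),
    ("preferential attachment", preferential_attachment),
]:
    print(name, measure(graph, "disease", "gene"))
```

In a literature-based discovery setting, scores like these would be computed for candidate disease-gene pairs that have no direct edge, and the pairs would then be ranked or thresholded by score.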