Information on causal relationships is essential to many sciences, including biomedical science, where knowing whether a gene-disease relation is causal or merely associative can lead to better treatments; it can also foster research on machine learning that exploits causal side information. Despite much work on Relation Extraction (RE), automatically extracting causal relations from large text corpora remains relatively unexplored. The few existing Causal RE (CRE) studies are limited to extracting causality within a single sentence or for a particular disease, mainly due to the lack of a diverse benchmark dataset. Here, we carefully curate a new CRE Dataset (CRED) of 3553 causal and non-causal gene-disease pairs, spanning 284 diseases and 500 genes, within or across sentences of 267 published abstracts. CRED was assembled in two phases to reduce class imbalance, and its inter-annotator agreement is 89%. To assess CRED's utility in classifying causal vs. non-causal pairs, we compared multiple classifiers and found SVM to perform best (F1 score 0.70). In terms of both classifier performance and model interpretability (i.e., whether the model assigns importance/attention to words with causal connotations in abstracts), models trained on CRED outperformed those trained on a state-of-the-art RE dataset. To move from benchmarks to real-world settings, we applied our CRED-trained classification model to all PubMed abstracts on Parkinson's disease (PD). Genes our model predicted to be causal for PD in at least 50 abstracts were validated against textbook sources. Beyond these well-studied genes, our model also revealed less-studied genes that merit further exploration. Our systematically curated and evaluated CRED, together with its associated classification model and CRED-wide gene-disease causality scores, thus offers concrete resources for advancing future research on CRE from biomedical literature.
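The causal vs. non-causal classification task described above can be illustrated with a minimal sketch, assuming a standard TF-IDF + linear SVM text-classification pipeline (scikit-learn); the toy sentences, labels, and placeholder entity names (GENE, DISEASE) below are invented for illustration and are not drawn from CRED or from the paper's actual feature set:

```python
# Hypothetical sketch: classify gene-disease sentence pairs as causal
# vs. non-causal with TF-IDF features and a linear SVM, in the spirit
# of the CRED experiments. All training examples here are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = [
    "Mutations in GENE cause DISEASE in affected families.",
    "Loss of GENE causes DISEASE onset in mouse models.",
    "GENE expression was measured in DISEASE patients.",
    "GENE levels were associated with DISEASE severity.",
]
train_labels = ["causal", "causal", "non-causal", "non-causal"]

# Word unigrams and bigrams let the SVM weight causal cue phrases
# such as "cause" or "causes" against associative ones like
# "associated with".
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LinearSVC(),
)
model.fit(train_texts, train_labels)

pred = model.predict(["Knockdown of GENE causes DISEASE."])[0]
print(pred)
```

In the paper's actual setting the classifier is trained on the curated CRED pairs (within- and cross-sentence) and evaluated by F1; the sketch only shows the general shape of such a pipeline, not the authors' features or hyperparameters.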