Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.511

SciNLI: A Corpus for Natural Language Inference on Scientific Text

Abstract: Existing Natural Language Inference (NLI) datasets, while being instrumental in the advancement of Natural Language Understanding (NLU) research, are not related to scientific text. In this paper, we introduce SciNLI, a large dataset for NLI that captures the formality in scientific text and contains 107,412 sentence pairs extracted from scholarly papers on NLP and computational linguistics. Given that the text used in scientific literature differs vastly from the text used in everyday language both in terms …
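As a format-only illustration of the premise-hypothesis sentence pairs the abstract describes, the sketch below scores a handful of pairs with a generic MNLI-tuned checkpoint from the Hugging Face Hub. The CSV path and column names are assumptions rather than the official SciNLI release layout, and SciNLI's own four-way label set does not match MNLI's three classes, so this is not a SciNLI baseline, only a demonstration of how such pairs are fed to an NLI classifier.

```python
# Minimal sketch (not from the paper): scoring premise-hypothesis pairs with a
# generic MNLI-tuned model. The file name and column names are assumptions;
# SciNLI's label set differs from MNLI's, so outputs are illustrative only.
import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

pairs = pd.read_csv("scinli_test.csv")  # assumed columns: sentence1, sentence2

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
model.eval()

batch = tokenizer(
    list(pairs["sentence1"][:8]),   # first sentence of each pair (premise)
    list(pairs["sentence2"][:8]),   # second sentence of each pair (hypothesis)
    padding=True,
    truncation=True,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**batch).logits
print([model.config.id2label[int(i)] for i in logits.argmax(dim=-1)])
```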

Cited by 9 publications (5 citation statements)
References: 31 publications
“…We name the experiment settings as HEALTHVER pred and HEALTHVER truth respectively. We also experiment on another dataset SciNLI (Sadat and Caragea, 2022) without any annotation. The structures in SciNLI are derived from the parsing model trained on HEALTHVER.…”
Section: Methods
Citation type: mentioning (confidence: 99%)
“…We proceed to train a joint entity and relation extraction model for extracting the sentence structures. Additionally, we utilize the extraction model to conduct experiments on another dataset SciNLI (Sadat and Caragea, 2022).…”
Section: Graph Representation and Reasoning
Citation type: mentioning (confidence: 99%)
“…To bolster the data set’s comprehensiveness, existing NLI data sets were incorporated. Specifically, the Stanford Natural Language Inference (SNLI) data set [27] and the SciNLI data set [28] were integrated. These data sets contributed a diverse range of general NLI instances, enriching the model’s ability to handle a wider spectrum of language structures and inferences.…”
Section: Data Set Preparation for Training and Validation
Citation type: mentioning (confidence: 99%)
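The quoted passage describes pooling SNLI with SciNLI to broaden the training data. Below is a minimal sketch of that pooling step, assuming a local SciNLI CSV with sentence1/sentence2/label columns; the actual release format may differ, and the two corpora use different label inventories, so a real pipeline must map them onto a shared scheme before training.

```python
# Rough sketch (assumptions noted inline) of combining SNLI with SciNLI-style
# pairs into a single training table, as described in the quoted passage.
import pandas as pd
from datasets import load_dataset

# SNLI is publicly hosted on the Hugging Face Hub.
snli = load_dataset("snli", split="train").to_pandas()
snli = snli[snli["label"] != -1]  # drop pairs without a gold label
snli = snli.rename(columns={"premise": "sentence1", "hypothesis": "sentence2"})
snli["source"] = "snli"

# Assumed local file and column names for SciNLI; the real release may differ.
scinli = pd.read_csv("scinli_train.csv")  # columns: sentence1, sentence2, label
scinli["source"] = "scinli"

# NOTE: SNLI and SciNLI do not share a label inventory; map both onto one
# scheme here before using the combined table for training.
combined = pd.concat(
    [snli[["sentence1", "sentence2", "label", "source"]],
     scinli[["sentence1", "sentence2", "label", "source"]]],
    ignore_index=True,
)
print(combined["source"].value_counts())
```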
“…The task is essential in many NLP applications, e.g. in discourse relation recognition (Chan et al., 2023), scientific document classification (Sadat and Caragea, 2022), or e-commerce product categorization (Shen et al., 2021). In practice, documents might be tagged with multiple categories that can be organized in a concept hierarchy, such as a taxonomy of a knowledge graph (Pan et al., 2017b,a), cf.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)