Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.359
On the Use of Context for Predicting Citation Worthiness of Sentences in Scholarly Articles

Abstract: In this paper, we study the importance of context in predicting the citation worthiness of sentences in scholarly articles. We formulate this problem as a sequence labeling task solved using a hierarchical BiLSTM model. We contribute a new benchmark dataset containing over two million sentences and their corresponding labels. We preserve the sentence order in this dataset and perform document-level train/test splits, which importantly allows incorporating contextual information in the modeling process. We eval…
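The abstract's hierarchical BiLSTM formulation can be pictured with a minimal PyTorch sketch: a word-level BiLSTM encodes each sentence into a vector, and a sentence-level BiLSTM runs over the sequence of sentence vectors so that neighbouring sentences supply context for each per-sentence cite/no-cite label. The class name, layer sizes, max-pooling choice, and vocabulary handling below are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of a hierarchical BiLSTM for sentence-level sequence labeling
# (citation worthiness). Dimensions and pooling are assumptions for illustration.
import torch
import torch.nn as nn


class HierarchicalBiLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, word_hidden=128,
                 sent_hidden=128, num_labels=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Word-level BiLSTM: encodes the tokens of each sentence independently.
        self.word_lstm = nn.LSTM(emb_dim, word_hidden,
                                 batch_first=True, bidirectional=True)
        # Sentence-level BiLSTM: runs over the sentence vectors of a document,
        # letting surrounding sentences provide context for each label.
        self.sent_lstm = nn.LSTM(2 * word_hidden, sent_hidden,
                                 batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * sent_hidden, num_labels)

    def forward(self, token_ids):
        # token_ids: (batch, num_sentences, max_tokens)
        b, s, t = token_ids.shape
        emb = self.embedding(token_ids.view(b * s, t))      # (b*s, t, emb_dim)
        word_out, _ = self.word_lstm(emb)                    # (b*s, t, 2*word_hidden)
        sent_vecs = word_out.max(dim=1).values               # max-pool over tokens
        sent_vecs = sent_vecs.view(b, s, -1)                 # (b, s, 2*word_hidden)
        sent_out, _ = self.sent_lstm(sent_vecs)              # (b, s, 2*sent_hidden)
        return self.classifier(sent_out)                     # logits per sentence


if __name__ == "__main__":
    model = HierarchicalBiLSTM(vocab_size=5000)
    dummy = torch.randint(1, 5000, (2, 6, 20))  # 2 documents, 6 sentences, 20 tokens
    print(model(dummy).shape)                   # torch.Size([2, 6, 2])
```

Because the labels are predicted jointly over an ordered document, document-level train/test splits (as described in the abstract) keep all sentences of a document on the same side of the split, so the sentence-level context seen at test time is never leaked from training.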

Cited by 3 publications (3 citation statements)
References 20 publications (11 reference statements)
“…Farber et al. [3] and Bonab et al. [4] utilized convolutional recurrent neural networks on diverse datasets. Context-aware citation detection was introduced by Gosangi et al. [5] with the ACL-cite dataset, integrating BiLSTMs and transformer-based embeddings. Wright et al. [6] delved into citation worthiness extensively, incorporating domain adaptation and transfer learning techniques.…”
Section: Related Work (citation type: mentioning)
Confidence: 99%
“…We experimented with different models trained on our dataset to establish the baselines for the task of citation-worthiness detection (RQ2). For this assessment, we used our subset with 1M entries. The split contains sentences sampled over all jurisdictions.…”
Section: Experimentation and Discussion (citation type: mentioning)
Confidence: 99%