PARM: A Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval
2022
DOI: 10.1007/978-3-030-99736-6_2

Cited by 14 publications (11 citation statements) · References 32 publications
“…Earlier techniques for legal information retrieval were mainly based on term-matching approaches (Kim and Goebel, 2017; Tran et al., 2018). Recently, a growing number of works have used neural networks to enhance retrieval performance, including word embedding models (Landthaler et al., 2016), doc2vec models (Sugathadasa et al., 2018), CNN-based models (Tran et al., 2019), and BERT-based models (Nguyen et al., 2021; Chalkidis et al., 2021; Althammer et al., 2022). To the best of our knowledge, we are the first to exploit the structure of statute law with GNNs to improve the performance of dense retrieval models.…”
Section: Related Work
Mentioning confidence: 99%
“…In the context of document-to-document retrieval, where the "query" can be extremely long, Tran et al. [95] first produce a summary that is further paired with lexical features in order to retrieve cases. Both PARM [96] and BERT-PLI [28] also condense the documents, performing paragraph-level modeling on top of candidates returned via BM25 or similar methods. Instead of focusing on the techniques, Shao et al. [85] presented a comparative user behavior study between legal and general-domain search, and suggested that legal information retrieval is more challenging with respect to query length and number of clicks/pages, among other metrics.…”
Section: An Overview of Major Legal NLP Tasks
Mentioning confidence: 99%
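The paragraph-level modeling this statement attributes to PARM can be made concrete with a short sketch: split the long query document into paragraphs, retrieve a candidate list per paragraph from a paragraph-level index, and fuse the per-paragraph lists into one document ranking. The `search_fn` helper and the reciprocal-rank-fusion rule below are illustrative assumptions, a minimal sketch rather than PARM's exact aggregation (the paper also explores vector-based fusion variants).

```python
# Minimal sketch of paragraph-level retrieval with rank-based aggregation,
# in the spirit of PARM. search_fn is an assumed helper that searches a
# paragraph-level index and returns hits already mapped to their parent
# document IDs, best-first: [(doc_id, score), ...].
from collections import defaultdict

def parm_retrieve(query_paragraphs, search_fn, top_k=100, rrf_k=60):
    """Retrieve per query paragraph, then fuse the paragraph-level result
    lists into a single document ranking via reciprocal rank fusion (RRF)."""
    doc_scores = defaultdict(float)
    for para in query_paragraphs:
        hits = search_fn(para, top_k)
        for rank, (doc_id, _score) in enumerate(hits, start=1):
            # RRF: each paragraph votes for a document by its rank,
            # so documents matched by many paragraphs rise to the top.
            doc_scores[doc_id] += 1.0 / (rrf_k + rank)
    return sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True)
```

Rank-based fusion sidesteps the problem that BM25 and dense retrievers produce scores on incompatible scales, which is why RRF-style rules are a common default for combining per-paragraph result lists.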
“…For the pool creation we use the runs from Hofstätter et al. [13]. In order to cover different first-stage retrieval methods, we use the lexical retrieval run with BM25 [24] (run 1 in Table 2) as well as the SciBERT_DOT run (run 2 in Table 2), which is based on dense retrieval [3,15]. As an additional run, we use the Ensemble, which reranks the BM25 top-200 candidates using an ensemble of BERT_CAT models based on SciBERT, PubMedBERT-Abstract, and PubMedBERT-Full Text (run 7 in Table 2).…”
Section: Data and Pool Preparation
Mentioning confidence: 99%
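The pool-creation step this statement describes is, in essence, a union of the top-ranked documents from several runs per query. The sketch below shows that idea under stated assumptions: the `build_pool` helper, its input format, and the pool depth are hypothetical illustrations, not the exact procedure used in the cited work.

```python
# Minimal sketch of top-k pooling across retrieval runs for building an
# annotation pool. Assumed input format:
#   runs = {run_name: {query_id: [doc_id, ...]}}  # rankings, best-first
def build_pool(runs, depth=10):
    """Return {query_id: set(doc_ids)}: the union, over all runs, of each
    run's top-`depth` documents per query."""
    pool = {}
    for ranking_by_query in runs.values():
        for qid, ranked_docs in ranking_by_query.items():
            # Pool only the head of each ranking; deeper documents are
            # unlikely to be judged relevant and inflate annotation cost.
            pool.setdefault(qid, set()).update(ranked_docs[:depth])
    return pool
```

Pooling diverse first-stage methods (lexical, dense, ensemble reranking) matters because each run surfaces different candidates; judging only one system's top results would bias the resulting relevance labels toward that system.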