When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset

Zheng, Lucia; Guha, Neel; Anderson, Brandon; Henderson, Peter; Ho, Daniel E.

doi:10.48550/arxiv.2104.08671

Cited by 12 publications

(16 citation statements)

References 27 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Most of these corpora are designed to facilitate a specific task. Recently, there are attempts to provide the data for multiple task training (and with it, a pre-trained model that can be customized and fine-tuned to further downstream tasks), for example CaseHold [866], Edgar [72]. Most pre-trained models rely on transformer architectures and provide a lightweight variant, fine-tuning from larger models such as BERT or GPT, namely Legal-BERT [99] and Legal-GPT [72].…”

Section: Datasets and Legal Language Modelsmentioning

confidence: 99%

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Wörmann¹,

Bogdoll²,

Bührle³

et al. 2022

Preprint

View full text Add to dashboard Cite

The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.Acknowledgement: The research leading to these results is funded by the German Federal Ministry for Economic Affairs and Climate Action within the project "KI Wissen -Entwicklung von Methoden für die Einbindung von Wissen in maschinelles Lernen". The authors would like to thank the consortium for the successful cooperation.

show abstract

Section: Datasets and Legal Language Modelsmentioning

confidence: 99%

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Wörmann¹,

Bogdoll²,

Bührle³

et al. 2022

Preprint

View full text Add to dashboard Cite

show abstract

“…They removed the cases which have more than one charge in a verdict. BERT [18] RoBERTa [58] DeBERTa [59] Longformer [60] BigBird [61] LEGAL-BERT [57] CaseLaw-BERT [62] ECHR [25] 1,1000 116…”

Section: Charge Prediction Datasetsmentioning

confidence: 99%

“…Link ECHR [63] 1,1000 116 US Law [64] 7,800 328 EU Law [65] 65,000 492 Contracts [66] 80,000 62 Contracts [67] 9,414 3 Harvard Law case [62] 52,800 86 CaseHOLD [62] CaseLaw-BERT [62] Harvard Law case […”

Section: Charge Prediction Datasetsmentioning

confidence: 99%

“…For example, LEGAL-BERT 8 [57], a new family of BERT models with 12 GB of English legal training corpora, consists of LEGAL-BERT-FP version (adapting standard BERT by additional pretraining on legal domain corpora) and LEGAL-BERT-SC version ( pretraining BERT from scratch on legal domain corpora). CaseLaw-BERT [62] is also the LEGAL-BERT-SC model but uses the case law corpus and custom domain-specific vocabulary. LegalDB is a DistillBERT model pre-trained on English legal-specific training corpora too.…”

Section: Pre-trained Language Modelmentioning

confidence: 99%

See 1 more Smart Citation

A Survey on Legal Judgment Prediction: Datasets, Metrics, Models and Challenges

Cui¹,

Shen²,

Nie³

et al. 2022

Preprint

View full text Add to dashboard Cite

Legal judgment prediction (LJP) applies Natural Language Processing (NLP) techniques to predict judgment results based on fact descriptions automatically. Recently, large-scale public datasets and advances in NLP research have led to increasing interest in LJP. Despite a clear gap between machine and human performance, impressive results have been achieved in various benchmark datasets. In this paper, to address the current lack of comprehensive survey of existing LJP tasks, datasets, models and evaluations, (1) we analyze 31 LJP datasets in 6 languages, present their construction process and define a classification method of LJP with 3 different attributes;(2) we summarize 14 evaluation metrics under four categories for different outputs of LJP tasks; (3) we review 12 legal-domain pretrained models in 3 languages and highlight 3 major research directions for LJP; (4) we show the state-of-art results for 8 representative datasets from different court cases and discuss the open challenges. This paper can provide up-to-date and comprehensive reviews to help readers understand the status of LJP. We hope to facilitate both NLP researchers and legal professionals for further joint efforts in this problem.

show abstract

“…We leave an exploration of the effects of domain-specific pretraining (e.g. using [44]) in this task for future work.…”

Section: Pre-training Vs Training From Scratchmentioning

confidence: 99%

Context-aware legal citation recommendation using deep learning

Huang

Low

Teng

et al. 2021

Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law

Self Cite

View full text Add to dashboard Cite

Lawyers and judges spend a large amount of time researching the proper legal authority to cite while drafting decisions. In this paper, we develop a citation recommendation tool that can help improve efficiency in the process of opinion drafting. We train four types of machine learning models, including a citation-list based method (collaborative filtering) and three context-based methods (text similarity, BiLSTM and RoBERTa classifiers). Our experiments show that leveraging local textual context improves recommendation, and that deep neural models achieve decent performance. We show that non-deep text-based methods benefit from access to structured case metadata, but deep models only benefit from such access when predicting from context of insufficient length. We also find that, even after extensive training, RoBERTa does not outperform a recurrent neural model, despite its benefits of pretraining. Our behavior analysis of the RoBERTa model further shows that predictive performance is stable across time and citation classes.

show abstract

When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset

Cited by 12 publications

References 27 publications

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

A Survey on Legal Judgment Prediction: Datasets, Metrics, Models and Challenges

Context-aware legal citation recommendation using deep learning

Contact Info

Product

Resources

About