Proceedings of the Natural Legal Language Processing Workshop 2022
DOI: 10.18653/v1/2022.nllp-1.22

E-NER — An Annotated Named Entity Recognition Corpus of Legal Text

Abstract: Identifying named entities, such as persons, locations, or organizations, in documents can highlight key information for readers. Training Named Entity Recognition (NER) models requires an annotated data set, which can be time-consuming and labour-intensive to produce. Nevertheless, there are publicly available NER data sets for general English. Recently there has been interest in developing NER for legal text. However, prior work and experimental results reported here indicate that there is a significant degradation in p…
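The degradation the abstract describes can be probed directly. The sketch below is an illustration, not code from the paper: it runs a publicly available general-English NER model over a legal-style sentence using the Hugging Face pipeline API. The model checkpoint and example sentence are assumptions chosen for demonstration.

```python
# A minimal sketch (not from the paper): applying an off-the-shelf
# general-English NER model to a legal sentence, to see how generic
# entity types behave outside the newswire domain.
from transformers import pipeline

# "dslim/bert-base-NER" is a publicly available CoNLL-2003-style model,
# used here purely for illustration; the paper's models may differ.
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

sentence = (
    "Pursuant to Section 10(b) of the Securities Exchange Act, "
    "the SEC filed suit against Enron Corp. in the Southern District of Texas."
)

for entity in ner(sentence):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```

Legal-specific mentions such as statute names and court designations tend to be missed or mislabeled by models trained only on general English, which is the gap an annotated legal corpus like E-NER is meant to close.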

Cited by 3 publications (2 citation statements)
References 34 publications (36 reference statements)
“…This has given rise to a number of adaptation techniques (Daumé III, 2007; Wiese et al., 2017; Ma et al., 2019; Cooper Stickland et al., 2021; Grangier and Iter, 2022; Ludwig et al., 2022). In the pretrain-fine-tune paradigm, for pretrained models to generalize over a task in a specific domain, it is advised to fine-tune them on domain-specific datasets, which requires domain-specific annotated resources (Tsatsaronis et al., 2015; Zhu et al., 2022; Au et al., 2022; Li et al., 2021). In this paper, we test whether in-domain pretraining improves performance on a domain-specific task, but we additionally try to gain a better understanding of these models' weaknesses by examining their generalization abilities.…”
Section: Domain Adaptation (mentioning, confidence: 99%)
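For context, here is a minimal sketch of the pretrain-fine-tune recipe this excerpt describes, using the Hugging Face transformers API. The checkpoint, label set, and hyperparameters are illustrative assumptions, not the cited papers' configurations.

```python
# A hedged sketch of the pretrain-fine-tune paradigm: adapting a generic
# pretrained encoder to a domain-specific, annotated NER dataset.
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed legal-domain tag set, for illustration only.
labels = ["O", "B-PERSON", "I-PERSON", "B-COURT", "I-COURT"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
)

args = TrainingArguments(
    output_dir="legal-ner",          # where checkpoints are written
    learning_rate=2e-5,              # a common fine-tuning rate
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

# `train_dataset` / `eval_dataset` would be tokenized, label-aligned splits
# of a legal NER corpus such as E-NER (data loading omitted here).
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_dataset, eval_dataset=eval_dataset)
# trainer.train()
```

The point of the excerpt is that this last, domain-specific step depends on annotated in-domain resources, which is exactly what E-NER supplies for legal text.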
“…In particular, Named Entity Recognition (NER) is a canonical information extraction task which consists of detecting text spans and classifying them into a predetermined set of entity types (Tjong Kim Sang and De Meulder, 2003; Lample et al., 2016a; Chiu and Nichols, 2016; Ni et al., 2017). In the past decade, numerous benchmark datasets have enabled researchers to compare and improve the performance of NER models within specific domains such as science (Luan et al., 2018), medicine (Jin and Szolovits, 2018), law (Au et al., 2022), finance (Salinas Alvarado et al., 2015), and social media (Ushio et al., 2022); in some cases, these datasets have spanned multiple domains (Liu et al., 2020b) or languages (Tjong Kim Sang and De Meulder, 2003). Such datasets are crucial for building models capable of handling a wide range of downstream applications.…”
Section: Introduction (mentioning, confidence: 99%)
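The span-detection-plus-classification framing in this excerpt is typically realized as BIO tagging. The small self-contained sketch below (an illustration, not code from any cited paper) decodes a BIO tag sequence into typed spans.

```python
# Decoding BIO tags into (start, end, type) spans: the standard way NER
# models turn per-token predictions into typed text spans.
def bio_to_spans(tags):
    """Convert a BIO tag sequence into (start, end_exclusive, entity_type) spans."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-") or (tag.startswith("I-") and etype != tag[2:]):
            if etype is not None:
                spans.append((start, i, etype))   # close the open span
            start, etype = i, tag[2:]             # open a new span
        elif tag == "O" and etype is not None:
            spans.append((start, i, etype))       # close the open span
            start, etype = None, None
    if etype is not None:
        spans.append((start, len(tags), etype))   # close a span at sequence end
    return spans

tokens = ["The", "SEC", "sued", "Enron", "Corp", "."]
tags = ["O", "B-ORG", "O", "B-ORG", "I-ORG", "O"]
for s, e, t in bio_to_spans(tags):
    print(t, " ".join(tokens[s:e]))   # prints: ORG SEC / ORG Enron Corp
```

Benchmark corpora like those the excerpt lists provide exactly these token-level annotations, so that models trained on them can be scored span-by-span against gold entities.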