2022
DOI: 10.1109/access.2022.3190408
On the Effectiveness of Pre-Trained Language Models for Legal Natural Language Processing: An Empirical Study

Abstract: We present the first comprehensive empirical evaluation of pre-trained language models (PLMs) for legal natural language processing (NLP) in order to examine their effectiveness in this domain. Our study covers eight representative and challenging legal datasets, ranging from 900 to 57K samples, across five NLP tasks: binary classification, multi-label classification, multiple-choice question answering, summarization, and information retrieval. We first run unsupervised, classical machine learning and/or non-PLM…

Cited by 8 publications (7 citation statements). References 97 publications.
“…In the fields of NLP and AI and Law, researchers have applied automated techniques to classify texts of contracts (e.g. in terms of 41 categories involving general information, restrictive covenants, revenue risks [5], clause fairness [6], statutes by topics [6, 7] and legal cases [8]).…”
Section: Related Work
confidence: 99%
“…The work on classifying legal cases has focused on classifying them by: (i) argument organizational categories including fact, issue, rule/law/holding, analysis and conclusion/opinion/answer [9]; (ii) judicial subtasks in connection with predicting judgments of civil law cases [10]; (iii) whether the case overrules a previous one or by the type of procedural motion addressed [6]; (iv) types of applicable legal claims [7]; (v) applicable civil code articles [11] and statutory elements [10]; (vi) domain concepts [9]; (vii) relevance to a query case [12]; (viii) a winning or losing factual scenario for particular types of claims [7]; (ix) factual features that strengthen or weaken a claim [13]. Items (i) and (vi) through (ix) are of special interest in empirical legal studies and, in particular, to the use of statistical methods such as ML algorithms, ‘to study legal doctrine through the use of fact-pattern analysis’ [14].…”
Section: Related Work
confidence: 99%
“…On one hand, the massive scale of complex text data enables or facilitates the (self-supervised) pre-training of L³M. On the other hand, the few-shot prompting (i.e., in-context learning) or zero-shot prompting capability of L³M for downstream tasks can greatly alleviate or even avoid the high labeling cost, while the flexibility of L³M to accommodate ambiguity and idiosyncrasies can help to meet the challenges of thoroughness and specialized knowledge. It is not surprising that with L³Ms such as LEGAL-BERT [6] and Lawformer [21], we are seeing new heights achieved in legal text classification and other tasks [25, 15].…”
Section: Present
confidence: 99%
“…However, a significant leap in language model performance is achieved when these models are underpinned by neural networks. This integration of neural networks significantly broadens the spectrum of natural language processing (NLP) tasks that a language model can tackle [1]–[3]. A neural language model exhibits versatility in handling NLP tasks, spanning from straightforward to intricate challenges.…”
Section: Introduction
confidence: 99%