[Context] Semantic legal metadata provides information that helps with understanding and interpreting the meaning of legal provisions. Such metadata is important for the systematic analysis of legal requirements. [Objectives] Our work is motivated by two observations: (1) The existing requirements engineering (RE) literature does not provide a harmonized view on the semantic metadata types that are useful for legal requirements analysis. (2) Automated support for the extraction of semantic legal metadata is scarce, and further does not exploit the full potential of natural language processing (NLP). Our objective is to take steps toward addressing these limitations. [Methods] We review and reconcile the semantic legal metadata types proposed in RE. Subsequently, we conduct a qualitative study aimed at investigating how the identified metadata types can be extracted automatically. [Results and Conclusions] We propose (1) a harmonized conceptual model for the semantic metadata types pertinent to legal requirements analysis, and (2) automated extraction rules for these metadata types based on NLP. We evaluate the extraction rules through a case study. Our results indicate that the rules generate metadata annotations with high accuracy.
Searching legal texts for relevant information is a complex and expensive activity. The search solutions offered by present-day legal portals are targeted primarily at legal professionals. These solutions are not adequate for requirements analysts whose objective is to extract domain knowledge including stakeholders, rights and duties, and business processes that are relevant to legal requirements. Semantic Web technologies now enable smart search capabilities and can be exploited to help requirements analysts in elaborating legal requirements. In our previous work, we developed an automated framework for extracting semantic metadata from legal texts. In this paper, we investigate the use of our metadata extraction framework as an enabler for smart legal search with a focus on requirements engineering activities. We report on our industrial experience helping the Government of Luxembourg provide an advanced search facility over Luxembourg's Income Tax Law. The experience shows that semantic legal metadata can be successfully exploited for answering requirements engineering-related legal queries. Our results also suggest that our conceptualization of semantic legal metadata can be further improved with new information elements and relations.
Software systems are increasingly subject to regulatory compliance. Extracting compliance requirements from regulations is challenging. Ideally, locating compliance-related information in a regulation requires a joint effort from requirements engineers and legal experts, whose availability is limited. However, regulations are typically long documents spanning hundreds of pages, containing legal jargon, applying complicated natural language structures, and including cross-references, thus making their analysis effort-intensive. In this paper, we propose an automated question-answering (QA) approach that assists requirements engineers in finding the legal text passages relevant to compliance requirements. Our approach utilizes large-scale language models fine-tuned for QA, including BERT and three variants. We evaluate our approach on 107 question-answer pairs, manually curated by subject-matter experts, for four different European regulatory documents. Among these documents is the General Data Protection Regulation (GDPR), a major source for privacy-related requirements. Our empirical results show that, in ≈94% of the cases, our approach finds the text passage containing the answer to a given question among the top five passages that our approach marks as most relevant. Further, our approach successfully demarcates, in the selected passage, the right answer with an average accuracy of ≈91%.
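The top-five figure reported above can be read as a hit rate over questions: a question counts as a hit when the passage containing its answer appears among the k highest-ranked passages. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `top_k_hit_rate`, the passage identifiers, and the input layout are all assumptions made for illustration.

```python
from typing import Sequence


def top_k_hit_rate(
    ranked_passages: Sequence[Sequence[str]],
    gold_passages: Sequence[str],
    k: int = 5,
) -> float:
    """Fraction of questions whose gold passage id is among the top-k ranked ids.

    ranked_passages[i] is the ranked list of passage ids for question i
    (most relevant first); gold_passages[i] is the id of the passage that
    actually contains the answer. Both layouts are hypothetical.
    """
    if len(ranked_passages) != len(gold_passages):
        raise ValueError("one ranked list is required per gold passage")
    if not gold_passages:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(ranked_passages, gold_passages)
        if gold in ranked[:k]
    )
    return hits / len(gold_passages)
```

For example, calling `top_k_hit_rate(rankings, gold, k=5)` on the 107 question-answer pairs would yield a value near 0.94 if the reported result holds, assuming the rankings come from the retrieval step described in the abstract.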
Semantic legal metadata provides information that helps with understanding and interpreting legal provisions. Such metadata is therefore important for the systematic analysis of legal requirements. However, manually enhancing a large legal corpus with semantic metadata is prohibitively expensive. Our work is motivated by two observations: (1) the existing requirements engineering (RE) literature does not provide a harmonized view on the semantic metadata types that are useful for legal requirements analysis; (2) automated support for the extraction of semantic legal metadata is scarce, and it does not exploit the full potential of artificial intelligence technologies, notably natural language processing (NLP) and machine learning (ML). Our objective is to take steps toward overcoming these limitations. To do so, we review and reconcile the semantic legal metadata types proposed in the RE literature. Subsequently, we devise an automated extraction approach for the identified metadata types using NLP and ML. We evaluate our approach through two case studies over the Luxembourgish legislation. Our results indicate a high accuracy in the generation of metadata annotations. In particular, in the two case studies, we were able to obtain precision scores of 97.2% and 82.4%, and recall scores of 94.9% and 92.4%.
Recent deep learning approaches to Natural Language Generation mostly rely on sequence-to-sequence models. In these approaches, the input is treated as a sequence, whereas in most cases the input to generation is either a tree or a graph. In this paper, we describe an experiment showing how enriching a sequential input with structural information improves results and helps support the generation of paraphrases.