Proceedings of the Natural Legal Language Processing Workshop 2021
DOI: 10.18653/v1/2021.nllp-1.14

SPaR.txt, a Cheap Shallow Parsing Approach for Regulatory Texts

Abstract: Automated Compliance Checking (ACC) systems aim to semantically parse building regulations to a set of rules. However, semantic parsing is known to be hard and requires large amounts of training data. The complexity of creating such training data has led to research that focuses on small sub-tasks, such as shallow parsing or the extraction of a limited subset of rules. This study introduces a shallow parsing task for which training data is relatively cheap to create, with the aim of learning a lexicon for ACC.…

Cited by 4 publications (1 citation statement)
References 37 publications
“…Recent approaches to the modeling and extraction of regulations in the construction domain vary greatly both in their choice of semantic representation and their methods for mapping text to such representations. Kruiper et al (2021) create the ScotReg corpus of Scottish building regulations, define a sequence labeling task that is a combination of shallow parsing (chunking) and semantic role labeling, assigning labels such as Action and Object to spans of text that are also syntactic constituents, and annotate 200 sentences using this representation to create the SPaR.txt dataset, which they use to train a standard deep learning architecture consisting of BERT embeddings, bidirectional Long Short-Term Memory (bi-LSTM) and Conditional Random Fields (CRFs). On the test portion of the dataset their models achieve precision, recall, and F1 scores around 80%.…”
Section: NLP in the Construction Domain (mentioning)
confidence: 99%
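
The citation statement above describes a sequence labeling task in which labels such as Action and Object are assigned to spans of text. As a rough illustration of what the output of such a tagger might look like once decoded, the sketch below turns per-token tags into labeled spans. It assumes a BIO tagging scheme (`B-`/`I-` prefixes, `O` for unlabeled tokens), which the statement does not specify; the function name `bio_to_spans` and the example sentence are hypothetical and are not taken from the SPaR.txt dataset.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group tokens into (label, text) spans from BIO tags.

    Assumption: tags look like "B-Object", "I-Object", or "O".
    A stray "I-" tag that does not continue the current span
    is treated as the start of a new span.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    label = None          # label of the span currently being built
    words: List[str] = []  # tokens collected for that span

    for token, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        if prefix == "I" and name == label:
            # Continue the current span.
            words.append(token)
            continue
        # Any other tag closes the current span, if there is one.
        if label is not None:
            spans.append((label, " ".join(words)))
        if prefix in ("B", "I") and name:
            label, words = name, [token]
        else:
            label, words = None, []

    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


# Hypothetical example sentence and tags, for illustration only.
example_tokens = ["The", "door", "shall", "open", "outwards"]
example_tags = ["B-Object", "I-Object", "B-Action", "I-Action", "O"]
example_spans = bio_to_spans(example_tokens, example_tags)
```

Here `example_spans` holds one Object span ("The door") and one Action span ("shall open"); the untagged token is dropped. A real pipeline would obtain the tags from a trained model rather than writing them by hand.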