Automated regulatory compliance checking requires automated extraction of requirements from regulatory textual documents and their formalization in a computer-processable rule representation. Such information extraction (IE) is a challenging task that requires complex analysis and processing of text. Natural Language Processing (NLP) aims at enabling computers to process natural language text in a human-like manner. This paper proposes a semantic, rule-based NLP approach for automated IE from construction regulatory documents. The approach uses a set of pattern-matching-based IE rules and conflict resolution (CR) rules. The patterns of the IE and CR rules draw on a variety of syntactic (syntax/grammar-related) and semantic (meaning/context-related) text features. We also propose and use phrase structure grammar (PSG)-based phrasal tags, together with separation and sequencing of semantic information elements, to reduce the number of patterns needed. We utilize an ontology to aid in the recognition of semantic text features (concepts and relations). We tested the proposed IE algorithms in extracting quantitative requirements from the 2009 International Building Code and achieved 0.969 precision and 0.944 recall.
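The pattern-matching idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's rule set: the `Token` type, the hand-assigned part-of-speech tags, the toy ontology, the single pattern, and the relation mapping are all hypothetical, chosen only to show how syntactic and semantic features might be combined in one IE rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    text: str
    pos: str  # syntactic feature: part-of-speech tag (hand-assigned here, not from a tagger)

# Hypothetical toy ontology: surface term -> semantic concept.
TOY_ONTOLOGY = {
    "door": "BuildingElement",
    "height": "QuantitativeProperty",
    "inches": "Unit",
}

def concept(tok):
    """Semantic feature lookup; returns None for terms outside the toy ontology."""
    return TOY_ONTOLOGY.get(tok.text.lower())

# One illustrative IE rule: an ordered list of (slot name, feature test) pairs.
PATTERN = [
    ("subject", lambda t: concept(t) == "BuildingElement"),
    ("attribute", lambda t: concept(t) == "QuantitativeProperty"),
    ("modal", lambda t: t.pos == "MD"),
    ("be", lambda t: t.text.lower() == "be"),
    ("neg", lambda t: t.text.lower() == "not"),
    ("comparative", lambda t: t.pos == "JJR"),
    ("than", lambda t: t.text.lower() == "than"),
    ("value", lambda t: t.pos == "CD"),
    ("unit", lambda t: concept(t) == "Unit"),
]

# Assumed normalization of negated comparatives into a comparison operator.
RELATIONS = {("not", "less"): ">=", ("not", "more"): "<="}

def apply_rule(tokens, pattern):
    """Slide the pattern over the tokens; return the matched slots or None."""
    n = len(pattern)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(test(tok) for (_, test), tok in zip(pattern, window)):
            return {name: tok.text for (name, _), tok in zip(pattern, window)}
    return None

def extract(tokens):
    """Turn a matched pattern into a structured quantitative requirement."""
    slots = apply_rule(tokens, PATTERN)
    if slots is None:
        return None
    return {
        "subject": slots["subject"],
        "attribute": slots["attribute"],
        "relation": RELATIONS[(slots["neg"].lower(), slots["comparative"].lower())],
        "value": float(slots["value"]),
        "unit": slots["unit"],
    }

# Illustrative sentence, not quoted from the International Building Code.
SENTENCE = [
    Token("The", "DT"), Token("door", "NN"), Token("height", "NN"),
    Token("shall", "MD"), Token("be", "VB"), Token("not", "RB"),
    Token("less", "JJR"), Token("than", "IN"), Token("80", "CD"),
    Token("inches", "NNS"),
]

requirement = extract(SENTENCE)
print(requirement)
```

A real system would need many such patterns plus conflict resolution between overlapping matches; the sketch covers only the matching step for one sentence shape.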
Existing automated compliance checking (ACC) systems are limited in their automation; they rely on hard-coded, proprietary rules for representing regulatory requirements, which requires major manual effort in extracting regulatory information from textual regulatory documents and coding this information into a rule format. To address this limitation, this paper proposes a new unified ACC system that integrates: (1) semantic natural language processing techniques and EXPRESS data-based techniques to automatically extract and transform both regulatory information (in regulatory documents) and design information [in building information models (BIMs)] for automated compliance reasoning, and (2) semantic logic-based information representation so that the reasoning can be fully automated. To test the proposed system, a BIM test case was checked for compliance with Chapter 19 of the International Building Code 2009. Compared with a manually developed gold standard, 98.7% recall and 87.6% precision in noncompliance detection were achieved.
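For readers unfamiliar with the evaluation measures reported above, the following is a minimal sketch of how precision and recall against a gold standard are commonly computed. The instance identifiers and counts are invented toy data, not results from the paper, and the function name is an assumption.

```python
def precision_recall(detected, gold):
    """Precision and recall of detected noncompliance instances against a gold standard."""
    true_positives = len(detected & gold)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Toy data: identifiers are hypothetical "element:property" labels.
gold = {"wall_1:thickness", "slab_2:cover", "column_3:strength", "footing_4:depth"}
detected = {"wall_1:thickness", "slab_2:cover", "column_3:strength",
            "beam_5:span", "wall_6:thickness"}

precision, recall = precision_recall(detected, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here three of the five detected instances are in the gold standard and three of the four gold instances are detected, so a system with high recall but lower precision, as in the reported results, finds most true noncompliance cases while also raising some false alarms.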
To fully automate regulatory compliance checking of construction projects, we need to automatically extract regulatory requirements from various construction regulatory documents and transform these requirements into a formalized format that enables automated reasoning. To address this need, the authors propose an approach for automatically extracting information from construction regulatory textual documents and transforming it into logic clauses that can be directly used for automated reasoning. This paper focuses on presenting the proposed information transformation (ITr) methodology and the corresponding algorithms. The proposed ITr methodology utilizes a rule-based, semantic natural language processing (NLP) approach. Semantic mapping (SeM) rules and conflict resolution (CoR) rules are used to automate the transformation process. Several syntactic text features (captured using NLP techniques) and semantic text features (captured using an ontology) are used in the SeM and CoR rules. A bottom-up method is leveraged to handle complex sentence components. A "consume and generate" mechanism is proposed to implement the bottom-up method and execute the SeM rules. The proposed ITr algorithms were tested in transforming information instances of quantitative requirements, which were automatically extracted from the International Building Code 2009, into logic clauses. The algorithms achieved 98.2% precision and 99.1% recall on the testing data.
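One way to picture the bottom-up "consume and generate" mechanism described above is the sketch below. It is an assumption-laden illustration, not the authors' algorithm: the tag names, the three rules, and the Prolog-like clause format are hypothetical. Each rule consumes a run of adjacent tagged elements and generates one larger element, so small components are combined first and the full clause emerges last.

```python
# Each rule: (tags to consume, function generating one new tagged element).
# Rules and output format are hypothetical, for illustration only.
RULES = [
    (("value", "unit"),
     lambda v, u: ("quantity", f"quantity({v}, {u})")),
    (("attribute", "relation", "quantity"),
     lambda a, r, q: ("condition", f"{a}(X, V), {r}(V, {q})")),
    (("subject", "condition"),
     lambda s, c: ("clause", f"compliant(X) :- {s}(X), {c}.")),
]

def consume_and_generate(elements, rules):
    """Repeatedly replace a matching run of elements with the element a rule generates."""
    elements = list(elements)
    changed = True
    while changed:
        changed = False
        for tags, generate in rules:
            n = len(tags)
            for i in range(len(elements) - n + 1):
                window = elements[i:i + n]
                if tuple(tag for tag, _ in window) == tags:
                    # Consume the matched elements and generate their replacement.
                    elements[i:i + n] = [generate(*(value for _, value in window))]
                    changed = True
                    break
            if changed:
                break  # restart from the first rule after every change
    return elements

# Tagged information elements, as an extraction step might supply them.
extracted = [
    ("subject", "door"),
    ("attribute", "height"),
    ("relation", "greater_than_or_equal"),
    ("value", "80"),
    ("unit", "inches"),
]

result = consume_and_generate(extracted, RULES)
print(result[0][1])
```

The loop stops when no rule matches, which here happens once the five elements have been reduced to a single clause. Conflict resolution between competing rules is left out of the sketch.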