Knowledge extraction and representation aims to identify information and to transform it into a machine-readable format. Knowledge representations support Information Retrieval tasks such as searching for single statements, documents, or metadata. Requirements specifications of complex systems such as automotive software systems are usually divided into different subsystem specifications. Nevertheless, there are semantic relations (e.g. dependencies) between individual documents of the separated subsystems, which have to be considered in further processes. If requirements engineers or other developers are not aware of these relations, this can lead to inconsistencies or malfunctions of the overall system. Therefore, there is a strong need for tool support to detect semantic relations in a set of large natural language requirements specifications. In this work, we present a knowledge extraction approach based on an explicit knowledge representation of the content of natural language requirements as a semantic relation graph. Our approach is fully automated and includes an NLP pipeline to transform unrestricted natural language requirements into a graph. We split the natural language text into different parts and relate them to each other based on their semantic relation. In addition to semantic relations, other relationships can also be included in the graph. We envision using a semantic search algorithm like spreading activation to allow users to search for different semantic relations in the graph.
Trace Link Recovery tries to identify and link related existing requirements with each other to support further engineering tasks. Existing approaches are mainly based on algebraic Information Retrieval or machine learning. Machine-learning approaches usually demand reasonably large, labeled datasets for training. Algebraic Information Retrieval approaches, such as distance between tf-idf scores, also work on smaller datasets without training but are limited in providing explanations for trace links. In this work, we present a Trace Link Recovery approach that is based on an explicit representation of the content of requirements as a semantic relation graph and uses Spreading Activation to answer trace queries over this graph. Our approach is fully automated, including an NLP pipeline to transform unrestricted natural language requirements into a graph. We evaluate our approach on five common datasets. Depending on the selected configuration, the predictive power varies strongly. With the best tested configuration, the approach achieves a mean average precision of 40% and a Lag of 50%. Even though the predictive power of our approach does not outperform state-of-the-art approaches, we think that an explicit knowledge representation is an interesting artifact to explore in Trace Link Recovery approaches to generate explanations and refine results.
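To make the idea of answering a trace query by Spreading Activation more concrete, the following is a minimal sketch of the general technique, not the authors' implementation. The graph layout (a dictionary of weighted edges), the parameter names `decay`, `threshold`, and `max_steps`, and the helper `rank_candidates` are all assumptions introduced for illustration; the abstract does not specify any of them.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.05, max_steps=3):
    """Propagate activation from seed nodes through a weighted graph.

    graph: dict mapping a node to a list of (neighbor, weight) pairs.
    All parameters are illustrative assumptions, not values from the paper.
    """
    activation = defaultdict(float)
    for node in seeds:
        activation[node] = 1.0

    # The frontier holds the energy that arrived at each node in the last step.
    frontier = {node: 1.0 for node in seeds}
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, []):
                pulse = energy * weight * decay
                # Pulses below the threshold are dropped to keep the search local.
                if pulse >= threshold:
                    next_frontier[neighbor] += pulse
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


def rank_candidates(activation, requirements, query):
    """Return activated requirement nodes (excluding the query), best first."""
    scored = [(r, activation.get(r, 0.0)) for r in requirements if r != query]
    return sorted((p for p in scored if p[1] > 0), key=lambda p: p[1], reverse=True)
```

In such a sketch, a trace query for one requirement would seed that requirement's node, and the sorted output of `rank_candidates` would play the role of the candidate list. Because activation only travels along explicit edges, the path a pulse took could in principle be reported as an explanation for a suggested link.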
As the complexity of systems continues to rise, model-driven development approaches are becoming more widely applied. Still, many created models are mainly used for documentation. As such, they are not designed to be used in subsequent stages of development, but merely as a means of improved overview and communication. In an effort to use existing UML2 activity diagrams of an industry partner (Daimler AG) as a source for automatic generation of software artifacts, we discovered that the diagrams often contain multiple instances of the same element. These redundant instances might improve the readability of a diagram. However, they complicate further approaches such as automated model analysis or traceability to other artifacts, because in most cases redundant instances must be handled as one distinct element. In this paper, we present an approach to automatically remove redundant ExecutableNodes within activity diagrams as they are used by our industry partner. The removal is implemented by merging the redundant instances into a single element and adding further elements to maintain the original behavior of the activity. We use reachability graphs to argue that our approach preserves the behavior of the activity. Additionally, we applied the approach to a real system described by 36 activity diagrams. As a result, 25 redundant instances were removed from 15 affected diagrams.
Trace Link Recovery tries to identify and link related existing requirements with each other to support further engineering tasks. Existing approaches are mainly based on algebraic Information Retrieval or machine learning. [Question/Problem] Machine-learning approaches usually demand reasonably large, labeled datasets for training. Algebraic Information Retrieval approaches, such as distance between tf-idf scores, also work on smaller datasets without training but are limited in considering the context of semantic statements. [Principal Ideas/Results] In this work, we revise our existing Trace Link Recovery approach that is based on an explicit representation of the content of requirements as a semantic relation graph and uses Spreading Activation to answer trace queries over this graph. The approach generates sorted candidate lists and is fully automated, including an NLP pipeline to transform unrestricted natural language requirements into a graph. It does not require any external knowledge bases or other resources. [Contribution] To improve the performance, we take a detailed look at five common datasets and adapt the graph structure and semantic search algorithm. Depending on the selected configuration, the predictive power varies strongly. With the best tested configuration, the approach achieves a mean average precision of 50%, a Lag of 30%, and a recall of 90%.
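Since the results are reported as mean average precision over sorted candidate lists, a short sketch of how that metric is commonly computed may help in reading the numbers. This follows the standard Information Retrieval definition and assumes that average precision is normalized by the total number of relevant links per query; the abstract does not state which variant the authors use, and the function names are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked candidate list.

    ranked: candidate ids, best first.
    relevant: set of ids that are true trace links for the query.
    Assumption: normalize by the number of relevant items.
    """
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at the rank where a true link was found.
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(queries):
    """Mean of average precision over (ranked, relevant) pairs.

    Queries without any relevant item are skipped.
    """
    scores = [average_precision(r, rel) for r, rel in queries if rel]
    return sum(scores) / len(scores) if scores else 0.0
```

A list that places every true link at the top scores 1.0, and each false candidate ranked above a true link lowers the score, which is why the metric suits approaches that return sorted candidate lists.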