NeRoSim: A System for Measuring and Interpreting Semantic Textual Similarity

Banjade, Rajendra; Niraula, Nobal B.; Maharjan, Nabin; Rus, Vasile; Ştefănescu, Dan; Lintean, Mihai; Gautam, Deepak

doi:10.18653/v1/s15-2030

Cited by 26 publications

(18 citation statements)

References 18 publications


“…• DTSim (Banjade et al, 2016): This team builds on the NeroSim system (Banjade et al, 2015), which participated in the 2015 task with good results using a system based on manual rules blended semantic similarity features. The team explored several chunking algorithms and included new rules.…”

Section: Systems Tools and Resources (mentioning)

confidence: 99%

SemEval-2016 Task 2: Interpretable Semantic Textual Similarity

Agirre, González-Agirre, López-Gazpio et al. 2016

Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)


The final goal of Interpretable Semantic Textual Similarity (iSTS) is to build systems that explain which are the differences and commonalities between two sentences. The task adds an explanatory level on top of STS, formalized as an alignment between the chunks in the two input sentences, indicating the relation and similarity score of each alignment. The task provides train and test data on three datasets: news headlines, image captions and student answers. It attracted nine teams, totaling 20 runs. All datasets and the annotation guideline are freely available.


“…We built upon a previous system called NeRoSim (Banjade et al, 2015). The limitation of their system was that the alignments were restricted to 1:1.…”

Section: Chunk Alignment System (mentioning)

confidence: 99%

DTSim at SemEval-2016 Task 2: Interpreting Similarity of Texts Based on Automated Chunking, Chunk Alignment and Semantic Relation Prediction

Banjade, Maharjan, Niraula et al. 2016

Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

Self Cite


In this paper we describe our system (DTSim) submitted at SemEval-2016 Task 2: Interpretable Semantic Textual Similarity (iSTS). We participated in both gold chunks category (texts chunked by human experts and provided by the task organizers) and system chunks category (participants had to automatically chunk the input texts). We developed a Conditional Random Fields based chunker and applied rules blended with semantic similarity methods in order to predict chunk alignments, alignment types and similarity scores. Our system obtained F1 score up to 0.648 in predicting the chunk alignment types and scores together and was one of the top performing systems overall.


“…For this, we compute the textual similarity or relation between extracted system use cases (taken as a query) and the regulations from the regulatory authority. We use SEMILAR API [1] to implement the similarity measurement techniques that assign each regulation a similarity score between 0 and 1 for a system use case. The regulations are then sorted on the basis of the similarity score assigned in decreasing order and top 5 regulations are extracted out from the regulations dataset.…”

Section: Automated Traceability Links Recovery (mentioning)

confidence: 99%

“…The Meteor evaluation metric scores regulations by aligning them to system use cases on the basis of exact, stemmed, synonymous, and paraphrase matches between words and phrases of text statements [1].…”

Section: C1: Meteor (mentioning)

confidence: 99%

SANAYOJAN: a framework for traceability link recovery between use-cases in software requirement specification and regulatory documents

Jain, Ghaisas, Sureka 2014

Proceedings of the 3rd International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering


User requirement specification (URS) documents written in the form of free-form natural language text contain system use-case descriptions as one of the elements in the URS. For a few application domains, some of the system use-cases in SRS define services and functionality which needs to comply with law, rules and regulations pertaining to the application domain. In this paper, we present a multi-step approach to automatically extract system use-cases from URS and construct traceability links between system-uses and appropriate regulations in the regulatory documents. We define lexicon-based, syntactic and semantic features to discriminate system use-cases from other elements in the SRS. We investigate the application of five semantic similarity methods implemented in the SEMILAR semantic similarity toolkit to compute similarity between a given system use-case with regulations in a regulatory document. We conduct a series of experiments on real-world data obtained from software projects of a large global Information Technology (IT) services company to validate the proposed approach. Experimental results demonstrate effectiveness (accuracy of 83.3% for system use-case extraction and 72% for constructing traceability links) and limitations of the proposed approach.

RESEARCH MOTIVATION AND AIM

Software applications and information systems providing services to the users and supporting business processes need to comply with the regulations related to the services and business processes supported by them [2][6][11][10][7][8][9][12][13]. For example, information systems in the healthcare domain need to comply with Health Insurance Portability and Accountability Act (HIPAA) and applications in certain financial domain need to comply with the Sarbanes-Oxley Act.
The need of software application compliance to regulations requires eliciting and addressing regulations related functional and non-functional requirements and also maintaining traceability of specific laws with specific elements in the software artifact due to regulatory changes [2][6][11][10][7][8][9][12][13]. Identification of elements within a software to specific regulations and maintaining the traceability links (focus of the work presented in this paper) as the system evolves is a non-trivial problem in the context of large and complex software systems. Manual process of uncovering traceability links between software artifacts and regulatory documents is not scalable, is tedious and error-prone due to the large size and complexity of the software as well as the regulations. Automatic traceability link recovery (compliance checking between software artifacts and regulatory documents) poses several technical challenges due to factors such as natural language text, terminology mismatches between software domain and legal domain and ensuring adaptability to regular amendments and revisions in regulations. Compliance checking and verification and traceability link recovery between software artifacts and regulatory documents is an area that has attrac...
