“…Although our work with the iLSI algorithm is closely related to the work reported in [21], there are two very important differences between how this algorithm was studied in [21] and how we use it here. Our work makes explicit the limitation that the iLSI algorithm is incapable of incorporating new information (source files and terms) as a software library evolves.…”
Section: Introduction
“…• iLSI: This algorithm was proposed by Jiang et al. [21] to incrementally update the LSA model of a dynamic collection of source files and related documentation for the purpose of search-based automated traceability link recovery.…”
Section: Introduction
“…In other words, we show that the iLSI algorithm is not the best choice for incrementally updating the LSA model of an evolving software repository. Second, Jiang et al.'s [21] experimental validation covers just two consecutive releases of the software libraries they worked with. In contrast, our experiments are based on commit-level information tracked over 10 years of commit history of the software libraries on which we report our results.…”
The problem of bug localization is to identify the source files related to a bug in a software repository. Information Retrieval (IR) based approaches create an index of the source files and learn a model that is then queried with a bug report for the relevant files. Despite advances in these tools, current approaches do not take into account the dynamic nature of software repositories: with traditional IR-based approaches to bug localization, the model parameters must be recalculated after each change to a repository. In contrast, this paper presents an incremental framework to update the parameters of the Latent Semantic Analysis (LSA) model as the data evolves. We compare two state-of-the-art incremental SVD update techniques for LSA with respect to retrieval accuracy and time performance. The dataset we used in our validation experiments was created by mining 10 years of version history of the AspectJ and JodaTime software libraries.
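To make the incremental idea concrete, here is a minimal fold-in sketch: one standard way to append new document columns to an existing rank-k LSA model without recomputing the SVD. This is an illustrative numpy sketch, not the iLSI algorithm or either paper's implementation, and the toy matrix is invented. Note that fold-in keeps the term basis U frozen, so, like the iLSI limitation highlighted above, it cannot introduce new terms.

```python
import numpy as np

def lsa_fit(A, k):
    """Rank-k LSA model of a term-document matrix A (terms x docs)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in(U, s, Vt, d):
    """Fold a new document vector d into an existing LSA model.

    Projects d into the latent space without recomputing the SVD;
    U and s are unchanged, so the basis slowly goes stale and no
    new terms can be represented."""
    d_hat = (U.T @ d) / s              # k-dim representation of the new doc
    Vt_new = np.column_stack([Vt, d_hat])
    return U, s, Vt_new

# toy term-document matrix: 5 terms x 4 source files (made-up counts)
A = np.array([[1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)
U, s, Vt = lsa_fit(A, k=2)

d = np.array([1, 0, 1, 0, 0], dtype=float)  # a newly committed file
U, s, Vt = fold_in(U, s, Vt, d)
print(Vt.shape)                              # one extra document column
```

A full SVD update (as opposed to fold-in) would also revise U and s; the trade-off between the two is exactly the accuracy-versus-time question the abstract raises.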
“…One such contribution is the Incremental Latent Semantic Indexing (LSI) algorithm for search-based automatic traceability link recovery proposed by Jiang et al. [34]. In this paper, the authors propose an incremental approach based on the LSA model to update the links between the source code files and the documentation as they both evolve.…”
Section: B. Improvements in Retrieval Efficiency
Abstract: Information Retrieval (IR) based bug localization techniques use bug reports to query a software repository and retrieve relevant source files. These techniques index the source files in the software repository and train a model which is then queried for retrieval purposes. Much of the current research focuses on improving the retrieval effectiveness of these methods; however, little consideration has been given to the efficiency of such approaches for software repositories that are constantly evolving. As the software repository evolves, index creation and model learning must be repeated to ensure accurate retrieval for each new bug. In doing so, the query latency may be unreasonably high, and re-computing the index and the model for files that did not change is computationally redundant. We propose an incremental update framework that continuously updates the index and the model using the changes made at each commit. We demonstrate that the same retrieval accuracy can be achieved with a fraction of the time needed by current approaches. Our results are based on two basic IR modeling techniques: the Vector Space Model (VSM) and the Smoothed Unigram Model (SUM). The dataset we used in our validation experiments was created by tracking the commit history of the AspectJ and JodaTime software libraries over a span of 10 years.
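As background for the VSM side, the index-then-query pipeline described in the abstract can be sketched as follows. This is an illustrative example using scikit-learn's TfidfVectorizer and cosine similarity; the file names and contents are invented, and the paper's actual implementation may differ.

```python
# Minimal VSM bug-localization sketch: index source files with TF-IDF,
# query with a bug report, rank files by cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# made-up repository contents (in practice: tokenized source files)
source_files = {
    "Parser.java":    "parse token stream syntax error recovery",
    "Formatter.java": "format date time zone pattern output",
    "Cache.java":     "cache entry eviction lookup miss",
}
bug_report = "wrong time zone offset when formatting dates"

vectorizer = TfidfVectorizer()
index = vectorizer.fit_transform(source_files.values())  # index the repo
query = vectorizer.transform([bug_report])               # query with the bug
scores = cosine_similarity(query, index).ravel()
ranking = sorted(zip(source_files, scores), key=lambda p: -p[1])
print(ranking[0][0])  # Formatter.java ranks first (shares "time", "zone")
```

In a naive pipeline, `fit_transform` is rerun on the whole repository after every commit; the incremental framework described above avoids exactly that redundant recomputation for unchanged files.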
“…Some of the software engineering problems related to concept location which have been addressed using LSI are: traceability link recovery between source code and documentation [De Lucia et al. 2007; Jiang et al. 2008; Marcus et al. 2005a], tracing requirements [Hayes et al. 2006; Lo et al. 2006] and other software artifacts [Lormans and Van Deursen 2006], identifying clones in software [Marcus and Maletic 2001; Tairas and Gray 2009], retrieving relevant artifacts in project histories [Cubranic et al. 2005], and measuring coupling and cohesion [De Lucia et al. 2008; Marcus et al. 2008] of classes. In these applications, the documents are formed from the source code (that is, a document can be a class, method, function, package, etc.)…”
The paper addresses the problem of concept location in source code by proposing an approach that combines Formal Concept Analysis and Information Retrieval. In the proposed approach, Latent Semantic Indexing, an advanced Information Retrieval technique, is used to map textual descriptions of software features or bug reports to relevant parts of the source code, presented as a ranked list of source code elements. Given the ranked list, the approach selects the most relevant attributes from the best-ranked documents, clusters the results, and presents them as a concept lattice generated using Formal Concept Analysis.

The approach is evaluated through a large case study on concept location in the source code of six open-source systems, using several hundred features and bugs. The empirical study focuses on the analysis of various configurations of the generated concept lattices, and the results indicate that our approach is effective in organizing the different concepts and their relationships present in the subset of the search results. In consequence, the proposed concept location method has been shown to outperform a standalone Information Retrieval based concept location technique by reducing the number of irrelevant search results across all the systems and lattice configurations evaluated, potentially reducing programmers' effort during software maintenance tasks involving concept location.
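The FCA step described above builds a lattice whose nodes are formal concepts: maximal pairs of (files, shared attributes) closed under the two derivation maps. A brute-force sketch over a toy binary context (the file names and attribute terms are invented for illustration; this is not the paper's implementation, which would derive attributes from the top-ranked search results):

```python
# Enumerate the formal concepts of a small object-attribute context.
from itertools import combinations

# binary context: which top-ranked files contain which attribute terms
context = {
    "Formatter.java": {"date", "zone"},
    "Clock.java":     {"date", "offset"},
    "Zone.java":      {"zone", "offset"},
}
objects = list(context)
all_attrs = set().union(*context.values())

def intent(files):
    """Attributes shared by every file in the set (all attrs if empty)."""
    sets = [context[f] for f in files]
    return set.intersection(*sets) if sets else set(all_attrs)

def extent(attrs):
    """Files containing every attribute in the set."""
    return {f for f in objects if attrs <= context[f]}

# a formal concept is a pair (A, B) with B = intent(A) and A = extent(B)
concepts = set()
for r in range(len(objects) + 1):
    for combo in combinations(objects, r):
        B = intent(combo)           # shared attributes of the file subset
        A = frozenset(extent(B))    # all files having those attributes
        concepts.add((A, frozenset(B)))

for A, B in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(A), sorted(B))
```

Ordering these concepts by inclusion of their extents yields the lattice; clustering search results under shared attributes is what lets the approach group related files and filter irrelevant ones.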