Proceedings of the 2005 International Workshop on Mining Software Repositories - MSR '05 2005
DOI: 10.1145/1083142.1083151

Toward mining "concept keywords" from identifiers in large software projects

Abstract: We propose the Concept Keyword Term Frequency/Inverse Document Frequency (ckTF/IDF) method as a novel technique to efficiently mine concept keywords from identifiers in large software projects. ckTF/IDF is suitable for mining concept keywords because it is more lightweight than the TF/IDF method and its heuristics are tuned for identifiers in programs. We then experimentally apply ckTF/IDF to our educational operating system udos, consisting of around 5,000 lines of C code, which produced…
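For orientation, the baseline that the abstract compares against can be sketched as plain TF/IDF computed over words split out of identifiers. This is not the authors' ckTF/IDF, whose heuristics the abstract does not describe; the function names, the splitting rule, and the file names below are assumptions made only for illustration.

```python
import math
import re
from collections import Counter


def split_identifier(identifier):
    """Split a snake_case or camelCase identifier into lowercase words."""
    tokens = []
    for part in identifier.split("_"):
        # Acronym runs, capitalized or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens]


def tf_idf(documents):
    """Score identifier words per file with standard TF/IDF.

    documents: dict mapping a file name to a list of identifiers.
    Returns a dict mapping each file name to {word: score}.
    """
    term_counts = {
        name: Counter(t for ident in idents for t in split_identifier(ident))
        for name, idents in documents.items()
    }
    n_docs = len(documents)

    # Document frequency: in how many files does each word occur?
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    scores = {}
    for name, counts in term_counts.items():
        total = sum(counts.values())
        scores[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


# Hypothetical identifiers from two hypothetical source files.
example = {
    "mm.c": ["page_table", "alloc_page"],
    "fs.c": ["readBlock", "alloc_inode"],
}
example_scores = tf_idf(example)
```

In this toy input, a word that appears in every file (here `alloc`) scores zero, while a word concentrated in one file (here `page` in `mm.c`) ranks highest for that file, which is the property that makes TF/IDF-style measures a candidate for surfacing concept keywords.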

Citations: cited by 19 publications (8 citation statements)
References: 1 publication
“…A framework to study how the same identifiers can be trusted to represent the implementation of the same concepts was proposed in [1]. In [16] the authors proposed a method based on an improvement of the term frequency/inverted document frequency measure to mine concept keywords from identifiers in large software projects. Early work by Biggerstaff et al [4] addressed the problem of identifying human oriented concepts in a software system and associating them with their implementation instances.…”
Section: The Study Of Software Vocabularies
Citation type: mentioning
confidence: 99%
“…This technique (even though simple) has been tried out in applying concept keywords and suitable e-documents have been extracted and provided to elearners according to their styles and interests, thus enhancing their learning performances [7]. For simple and straight forward concept extraction application (like cognitive dimension extraction), concept keywords have been recommended to use in computing term frequency (tf) & Inverse Document Frequency (IDF), for efficiently extracting documents [8]. Online tests are conducted to assess learner's achievements and to maintain such online items, it is a big task [9].…”
Section: Literature Survey
Citation type: mentioning
confidence: 99%
“…There are regular expression-based searching techniques such as UNIX grep, however, most current research focuses on modern information retrieval (IR) [24,31] and natural language [37] searching techniques. To make queries more effective at locating relevant code, approaches have been suggested to help find the concept words in software [27,37]. These approaches are complementary to our work to potentially discover a seed method set, which is used as input to our current approach.…”
Section: Search-based Exploration Approaches
Citation type: mentioning
confidence: 99%