We are trying to find source code comments that help programmers understand a nontrivial part of source code. One of such examples would be explaining to assign a zero as a way to "clear" a buffer. Such comments are invaluable to programmers and identifying them correctly would be of great help. Toward this goal, we developed a method to discover explanatory code comments in a source code. We first propose eleven distinct categories of code comments. We then developed a decision-tree based classifier that can identify explanatory comments with 60% precision and 80% recall. We analyzed 2,000 GitHub projects that are written in two languages: Java and Python. This task is novel in that it focuses on a microscopic comment ("local comment") within a method or function, in contrast to the prior efforts that focused on API-or methodlevel comments. We also investigated how different category of comments is used in different projects. Our key finding is that there are two dominant types of comments: preconditional and postconditional. Our findings also suggest that many English code comments have a certain grammatical structure that are consistent across different projects.
We propose the Concept Keyword Term Frequency/Inverse Document Frequency (ckTF/IDF) method as a novel technique to efficiency mine concept keywords from identifiers in large software projects. ckTF/IDF is suitable for mining concept keywords, since the ckTF/IDF is more lightweight than the TF/IDF method, and the ckTF/IDF's heuristics is tuned for identifiers in programs.We then experimentally apply the ckTF/IDF to our educational operating system udos, consisting of around 5,000 lines in C code, which produced promising results; the udos's source code was processed in 1.4 seconds with an accuracy of around 57%. This preliminary result suggests that our approach is useful for mining concept keywords from identifiers, although we need more research and experience.
We propose the Concept Keyword Term Frequency/Inverse Document Frequency (ckTF/IDF) method as a novel technique to efficiency mine concept keywords from identifiers in large software projects. ckTF/IDF is suitable for mining concept keywords, since the ckTF/IDF is more lightweight than the TF/IDF method, and the ckTF/IDF's heuristics is tuned for identifiers in programs.We then experimentally apply the ckTF/IDF to our educational operating system udos, consisting of around 5,000 lines in C code, which produced promising results; the udos's source code was processed in 1.4 seconds with an accuracy of around 57%. This preliminary result suggests that our approach is useful for mining concept keywords from identifiers, although we need more research and experience.
A new color code which has high robustness is proposed, called Robust Index Code (RIC for short). RIC can be used on digital images and link them with the database. There are several technologies embedding data into images such as QR Code and Digital watermark. QR Code cannot be used on digital images because it does not have robustness on digital images. Besides Digital watermark can be used on digital images, but embedded data cannot be extracted 100% on damaged images. From evaluation using our implemented RIC encoder and decoder, the encoded indexes can be extracted 100% on compressed images to 30%. We also implemented a doubt color correction algorithm for damaged images. In conclusion, RIC has the high robustness on digital images. Hence, it is able to store all the type of digital products by embedding indexes into digital images to access database, which means it makes a Superdistribution system with digital images realized. Therefore RIC has the potential for new Internet image services, since all the images encoded by RIC are possible to access original products anywhere.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.