Code smells, or bad smells, are an accepted approach to identifying design flaws in source code. Although they have been explored extensively by researchers, their interpretation by programmers is rather subjective. One way to deal with this subjectivity is to use machine learning techniques. This paper gives the reader an overview of the machine learning techniques and code smells found in the literature. It aims to determine which methods and practices are used when applying machine learning to code smell identification, and which machine learning techniques have been used for that purpose. A mapping study was used to identify the techniques applied to each smell. We found that Bloaters were the most studied kind of smell, addressed by 35% of the papers. The most commonly used technique was Genetic Algorithms (GA), used by 22.22% of the papers. Regarding the smells addressed by each technique, there was a high level of redundancy: each smell is covered by a wide range of algorithms. Nevertheless, Feature Envy stood out, being targeted by 63% of the techniques. In terms of performance, the best average was achieved by Decision Tree, followed by Random Forest, Semi-supervised and Support Vector Machine Classifier techniques. Five of the 25 analyzed smells were not handled by any machine learning technique. Most techniques target several code smells, and in general no technique outperforms the others, except for a few specific smells. We also found a lack of comparable results, caused by the heterogeneity of both the data sources and the reported results. We recommend further empirical studies that assess the performance of these techniques on a standardized dataset, to improve the reliability and replicability of comparisons.
Coupling metrics help identify elements that may affect the quality of object-oriented software. A high degree of coupling can affect external software quality attributes such as maintainability. This paper presents the results of a study in which software developers with different levels of programming experience assessed the degree of coupling of a project. The study related coupling metrics to the responses obtained in order to identify critical points and improve the quality and maintainability of object-oriented systems. Keywords: Coupling. Software Maintenance. Coupling Metrics.
Software analytics enables the analysis of data in software repositories to support well-founded decisions by software engineers. To promote the use of software analytics, this paper addresses the question: can software analytics reveal which source files of a system have been altered and the reasons for those alterations? Using topic modeling and association rules, we studied the source code repository of the open source software Jenkins and developed a prototype system based on these techniques. A focus group of software development professionals evaluated the prototype. We verified that the method presented in this study makes it possible to identify which files of the system are being altered and why, which facilitates mapping the software's files by subject and planning refactoring, builds, and software stability and evolution.
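To make the association rule idea concrete, the following is a minimal sketch of how co-change rules between files could be derived from commit data. It is not the prototype described in the abstract: the function name `co_change_rules`, the thresholds, and the file paths are hypothetical, and only pairwise rules are considered. Support is the fraction of commits that touch both files, and confidence is the fraction of commits touching the first file that also touch the second.

```python
from collections import Counter
from itertools import combinations


def co_change_rules(commits, min_support=0.5, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) rules for file pairs.

    `commits` is a list of iterables, each holding the files touched by one commit.
    """
    n = len(commits)
    if n == 0:
        return []

    file_counts = Counter()  # number of commits touching each file
    pair_counts = Counter()  # number of commits touching both files of a pair
    for files in commits:
        unique = sorted(set(files))
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / file_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical commit history (illustrative paths, not taken from Jenkins).
commits = [
    ["core/Job.java", "core/Queue.java"],
    ["core/Job.java", "core/Queue.java", "ui/View.java"],
    ["core/Job.java"],
    ["ui/View.java"],
]
rules = co_change_rules(commits)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In a setting like the one the abstract describes, rules of this kind would indicate which files tend to change together, while a topic model over the commit messages would suggest why they change.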