<div>Code smells are structures in code that indicate the presence of maintainability issues. A significant problem with code smells is their ambiguity. They are challenging to define, and software engineers differ in their understanding of what constitutes a code smell and which code suffers from one.</div><div>A solution to this problem could be an AI digital assistant that understands code smells and can detect (and perhaps resolve) them. However, it is challenging to develop such an assistant, as there are few usable datasets of code smells on which to train and evaluate it. Furthermore, the existing datasets suffer from issues that mostly arise from the unsystematic approach used in their construction.</div><div>Through this work, we address this issue by developing a procedure for the systematic manual annotation of code smells. We use this procedure to build a dataset of code smells. During this process, we refine the procedure and identify recommendations and pitfalls for its use. The primary contribution is the proposed annotation model and procedure, together with the annotators’ experience report. The dataset and supporting tool are secondary contributions of our study. Notably, our dataset includes open-source projects written in the C# programming language, while almost all manually annotated datasets contain projects written in Java.</div>
The coronavirus disease of 2019 (COVID-19) pandemic has severely crippled our globalized society. Despite the chaos, much of our civilization continued to function, thanks to contemporary information and communication technologies. In education, this situation required instructors and students to abandon the traditional face-to-face lectures and move to a fully online learning environment. Such a transition is challenging, both for the teacher tasked with creating digital educational content, and the student who needs to study in a new and isolated working environment. As educators, we have experienced these challenges when migrating our university courses to an online environment. Through this paper, we look to assist educators with building and running an online course. Before we needed to transition online, we researched and followed the best practices to establish various digital educational elements in our online classroom. We present these elements, along with guidance regarding their development and use. Next, we designed an empirical study consisting of two surveys, focus group discussions, and observations to understand the factors that influenced students' engagement with our online classroom. We used the same study to evaluate students' perceptions regarding our digital educational elements. We report the findings and define a set of recommendations from these results to help educators motivate their students and develop engaging digital educational content. Although our research is motivated by the pandemic, our findings and contributions are useful to all educators looking to establish some form of online learning. This includes developers of massive open online courses and teachers promoting blended learning in their classrooms.
With ever-greater reliance of the developed world on information and communication technologies, constructing secure software has become a top priority. To produce secure software, security activities need to be integrated throughout the software development lifecycle. One such activity is security design analysis (SDA), which identifies security requirements as early as the software design phase. While considered an important step in software development, the general opinion of information security subject matter experts and researchers is that SDA is challenging to learn and teach. Experimental evidence provided in literature confirms this claim. To help solve this, we have developed a framework for teaching SDA by utilizing case study analysis and the hybrid flipped classroom approach. We evaluate our framework by performing a comparative analysis between a group of students who attended labs generated using our framework and a group that participated in traditional labs. Our results show that labs created using our framework achieve better learning outcomes for SDA, as opposed to the traditional labs. Secondary contributions of our article include teaching materials, such as lab descriptions and a case study of a hospital information system to be used for SDA. We outline instructions for using our framework in different contexts, including university courses and corporate training programs. By using our proposed teaching framework, with our or any other case study, we believe that both students and employees can learn the craft of SDA more effectively.
Code smells are code structures that harm the software’s quality. An obstacle to developing automatic detectors is the limitations of the available datasets. Furthermore, researchers have developed many solutions for Java while neglecting other programming languages. Recently, we created a code smell dataset for C# by following an annotation procedure inspired by established annotation practices in Natural Language Processing. This paper evaluates Machine Learning (ML) code smell detection approaches on our novel dataset. We consider two feature representations to train ML models: (1) code metrics and (2) CodeT5 embeddings. This study is the first to apply CodeT5, a state-of-the-art neural source code embedding, to code smell detection in C#. To prove the effectiveness of ML, we consider multiple metrics-based heuristics as alternatives. In our experiments, the best-performing approach was the ML classifier trained on code metrics (F-measure of 0.87 for Long Method and 0.91 for Large Class detection). However, the performance improvement over CodeT5 features is negligible if we consider the advantages of automatically inferring features. We showed that our model exceeds human performance and could be helpful to developers. To the best of our knowledge, this is the first study to compare the performance of automatic smell detectors against human performance.