Context: It is not uncommon for a new team member to join an existing Agile software development team, even after development has started. This new team member faces a number of challenges before they are integrated into the team and can contribute productively to team progress. Ideally, each newcomer should be supported in this transition through an effective team onboarding program, although prior evidence suggests that this is challenging for many organisations. Objective: We seek to understand how Agile teams address the challenge of team onboarding in order to inform future onboarding design. Method: We conducted an interview survey of eleven participants from eight organisations to investigate what onboarding activities are common across Agile software development teams. We also identify common goals of onboarding from a synthesis of the literature. A repertory grid instrument is used to map the contributions of onboarding techniques to onboarding goals. Results: Our study reveals that a broad range of team onboarding techniques, both formal and informal, are used in practice. It also shows that particular techniques have high contributions to a given goal or set of goals. Conclusions: In presenting a set of onboarding goals to consider and an evidence-based mechanism for selecting techniques to achieve the desired goals, it is expected that this study will contribute to better-informed onboarding design and planning. An increase in practitioner awareness of the options for supporting new team members is also an expected outcome.
Cross-project defect prediction (CPDP) makes use of cross-project (CP) data to overcome the lack of data necessary to train well-performing software defect prediction (SDP) classifiers in the early stage of new software projects. Since the CP data (known as the source) may differ from the new project’s data (known as the target), it is difficult for CPDP classifiers to perform well. In particular, it is a mismatch of data distributions between source and target that creates this difficulty. Transfer learning-based CPDP classifiers are designed to minimize these distribution differences. The first transfer learning-based CPDP classifiers treated these differences equally, thereby degrading prediction performance. To address this, recent research has proposed the Weighted Balanced Distribution Adaptation (W-BDA) method to leverage the importance of both distribution differences to improve classification performance. Although W-BDA has been shown to improve model performance in CPDP, research to date has failed to consider model performance in light of increasing target data or variances in data sampling. We provide the first investigation of when, and to what extent, increasing the target data and using various sampling techniques affect performance when leveraging the importance of both distribution differences. We extend the initial W-BDA method and call this extension the W-BDA<sup>+</sup> method. To evaluate the effectiveness of W-BDA<sup>+</sup> for improving CPDP performance, we conduct eight experiments on 18 projects from four datasets, where data sampling was performed with different sampling methods. We evaluate our method using four complementary indicators (i.e., Balanced Accuracy, AUC, F-measure and G-measure). Our findings reveal an average improvement of 6%, 7.5%, 10% and 12% for these four indicators when W-BDA<sup>+</sup> is compared to five other baseline methods (including W-BDA), for all four of the sampling methods used.
Also, as the target-to-source ratio is increased with different sampling methods, we observe a decrease in performance for the original W-BDA, with our W-BDA<sup>+</sup> approach outperforming the original W-BDA in most cases. Our results highlight the importance of adjusting for data imbalance and having an awareness of the effect of the increasing availability of target data in CPDP scenarios.
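The abstract evaluates classifiers with four complementary indicators: Balanced Accuracy, AUC, F-measure and G-measure. As a minimal sketch, the standard definitions of these indicators can be computed from binary predictions and scores as below; the exact formulations used in the paper may differ, and the function name and inputs here are illustrative assumptions.

```python
def cpdp_indicators(y_true, y_pred, y_score):
    """Compute Balanced Accuracy, AUC, F-measure and G-measure
    (standard definitions) for binary defect predictions.

    y_true  -- true labels (1 = defective, 0 = clean)
    y_pred  -- predicted labels
    y_score -- predicted defect-proneness scores (for AUC)
    """
    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    recall = tp / (tp + fn) if tp + fn else 0.0       # probability of detection
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0  # 1 - false-alarm rate

    balanced_acc = (recall + specificity) / 2
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # G-measure: harmonic mean of recall and specificity.
    g_measure = (2 * recall * specificity / (recall + specificity)
                 if recall + specificity else 0.0)

    # AUC via the rank-sum (Mann-Whitney) formulation: the fraction of
    # (defective, clean) pairs the score ranks correctly, ties counting 0.5.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else 0.0

    return balanced_acc, auc, f_measure, g_measure
```

Because Balanced Accuracy and G-measure weight the minority (defective) class and the majority class equally, they are less distorted by class imbalance than plain accuracy, which is why imbalance-aware studies such as this one report them alongside F-measure and AUC.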