Abstract: Software developers are increasingly active on GitHub and StackOverflow, generating a large amount of valuable data in both communities. Researchers mine this information to understand developer behaviors, but previous work mainly focuses on mining data within a single community. In this paper, we propose a novel approach to mining developer behaviors across GitHub and StackOverflow. The approach links accounts from the two communities using a CART decision tree, leveraging features from usernames, user behaviors, and writing styles. It then explores cross-site developer behaviors through T-graph analysis, LDA-based topic clustering, and cross-site tagging. We conducted several experiments to evaluate this approach. The results show that the precision and F-score of our identity linkage method are higher than those of previous methods in software communities. In particular, we discovered that (1) active issue committers are also active question askers; (2) for most developers, the topics of their content on GitHub are similar to those of their questions and answers on StackOverflow; (3) developers' concerns on StackOverflow shift over time with the projects they are currently participating in on GitHub; (4) developers' concerns on GitHub are more relevant to their answers than to their questions and comments on StackOverflow.
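The abstract names usernames as one feature source for identity linkage but does not list the concrete features. The following is a minimal sketch of what username-based features for one candidate GitHub/StackOverflow account pair might look like; the function names and the three features are hypothetical choices for illustration, not the authors' feature set. Vectors like these, together with behavior and writing-style features, could then be passed to a decision tree learner.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a username and drop non-alphanumeric characters."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def username_features(github_name: str, so_name: str) -> dict:
    """Illustrative similarity features for one candidate account pair.

    The feature set is an assumption; the abstract does not specify it.
    """
    a, b = normalize(github_name), normalize(so_name)
    return {
        # 1.0 when the normalized names are identical
        "exact_match": float(a == b),
        # character-level similarity ratio in [0, 1]
        "similarity": SequenceMatcher(None, a, b).ratio(),
        # 1.0 when one non-empty name is a substring of the other
        "containment": float(bool(a) and bool(b) and (a in b or b in a)),
    }


features = username_features("Jane-Doe", "jane_doe")
```

Normalizing before comparison makes the features insensitive to case and separator differences, which are common between sites.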
With the trend toward developing software on the Internet, many software crowdsourcing platforms are emerging. They attract many developers to bid for crowdsourced projects and develop software systems collaboratively. In this paper, we present CrowDevBot, a task-oriented conversational bot for software crowdsourcing platforms that aims to help online users complete crowdsourcing-related tasks in a more natural manner. The key ideas of CrowDevBot are to: (1) combine a rule-based method and an SVM-NaiveBayes-C4.5 integrated learning method to discover users' intentions; (2) employ an integrated CRF (conditional random field) method with novel features to improve the performance of slot filling; and (3) leverage a software service knowledge base to unify entity names and predefine the key slots of user queries. We implement CrowDevBot and integrate it into JointForce, an IT software crowdsourcing platform in China. To the best of our knowledge, this is the first time that a task-oriented conversational bot has been used in practice on a software crowdsourcing platform. We evaluated our approach on a real data set from JointForce. The results show that our intention detection method achieves an F1-score of 87% with limited training data. For slot filling, the F1-score of our integrated CRF model reaches 82%, 8% higher than that of the normal CRF model. Index Terms: task-oriented conversational bot, software crowdsourcing platform, integrated statistical learning, user intention understanding
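The abstract does not say how the rule-based method and the three learned classifiers are combined. One plausible arrangement, sketched below purely as an assumption, is to let keyword rules decide first and fall back to a majority vote over the classifiers. The rule table, intent labels, and function names are hypothetical, and the classifiers are passed in as plain callables so that any trained SVM, Naive Bayes, or C4.5 model could be wrapped.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

Classifier = Callable[[str], str]

# Hypothetical keyword rules; the real rule set is not given in the abstract.
RULES = {"bid": "bid_project", "publish": "publish_project"}


def rule_intent(utterance: str) -> Optional[str]:
    """Return the intent of the first matching keyword rule, or None."""
    words = utterance.lower().split()
    for keyword, intent in RULES.items():
        if keyword in words:
            return intent
    return None


def detect_intent(utterance: str, classifiers: Sequence[Classifier]) -> str:
    """Rules first; otherwise a majority vote over the classifiers.

    Assumes at least one classifier is supplied.
    """
    ruled = rule_intent(utterance)
    if ruled is not None:
        return ruled
    votes = Counter(clf(utterance) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Stub classifiers standing in for trained models.
stubs = [lambda u: "ask_status", lambda u: "chitchat", lambda u: "ask_status"]
intent = detect_intent("how is my project going", stubs)
```

Putting the rules first keeps high-precision patterns out of the vote, which matters when training data is limited.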
Abstract: A good software process can help project managers manage software development effectively and control development risks. For this reason, theory and expert experience are distilled into process patterns. However, finding appropriate process patterns in practice still requires human skill. To tackle this challenge, this paper proposes a Goal-Question-Metric (GQM) based approach to recommending software process patterns. The essential idea is to use the GQM method to design scenario questions for software process patterns, elicit the requirements of a new project by answering these questions, and then recommend the best-matching patterns for the project. In particular, we apply a Latent Dirichlet Allocation model to the scenario descriptions of software process patterns to obtain a text-topic distribution, and then apply the K-means method to cluster the texts, which greatly facilitates the design of scenario questions. We evaluate the performance of our topic clustering method by comparing it with a statistical method based on TF-IDF. The evaluation results show that our method achieves an F-score 11.6% higher than that of the traditional TF-IDF approach. Furthermore, the average precision of recommendation reaches 57%.
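The clustering step described above can be illustrated with a small sketch. It assumes the per-document topic distributions have already been produced by an LDA model (the values below are made up) and runs a plain K-means over them. The function name, the first-k initialization, and the fixed iteration count are assumptions chosen to keep the example deterministic; it also assumes `k` does not exceed the number of points.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Cluster points with plain K-means; returns a cluster index per point."""
    # Deterministic start: the first k points serve as initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Made-up topic distributions for four scenario descriptions over two topics.
topic_distributions = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
clusters = kmeans(topic_distributions, k=2)
```

Clustering on topic distributions rather than raw term weights groups scenario descriptions by theme, so each cluster can suggest one family of scenario questions.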