With the advent of social media, developers are increasingly using it in their software development activities. Twitter is one of the popular social mediums used by developers. A recent study by Singer et al. found that software developers use Twitter to “keep up with the fast-paced development landscape.” Unfortunately, due to the general-purpose nature of Twitter, it’s challenging for developers to use Twitter for their development activities. Our survey with 36 developers who use Twitter in their development activities highlights that developers are interested in following specialized software gurus who share relevant technical tweets.
To help developers perform this task, in this work we propose a recommendation system to identify specialized software gurus. Our approach first extracts different kinds of features that characterize a Twitter user and then employs a two-stage classification approach to generate a discriminative model, which can differentiate specialized software gurus in a particular domain from other Twitter users that generate domain-related tweets (aka domain-related Twitter users). We have investigated the effectiveness of our approach in finding specialized software gurus for four different domains (JavaScript, Android, Python, and Linux) on a dataset of 86,824 Twitter users who generate 5,517,878 tweets over 1 month. Our approach can differentiate specialized software experts from other domain-related Twitter users with an F-Measure of up to 0.820. Compared with existing Twitter domain expert recommendation approaches, our proposed approach can outperform their F-Measure by at least 7.63%.
Software developers have benefited from various sources of knowledge such as forums, question-and-answer sites, and social media platforms to help them in various tasks. Extracting software-related knowledge from different platforms involves many challenges. In this paper, we propose an approach to improve the effectiveness of knowledge extraction tasks by performing crossplatform analysis. Our approach is based on transfer representation learning and word embeddings, leveraging information extracted from a source platform which contains rich domain-related content. The information extracted is then used to solve tasks in another platform (considered as target platform) with less domain-related contents. We first build a word embeddings model as a representation learned from the source platform, and use the model to improve the performance of knowledge extraction tasks in the target platform. We experiment with Software Engineering Stack Exchange and Stack Overflow as source platforms, and two different target platforms, i.e., Twitter and YouTube. Our experiments show that our approach improves performance of existing work for the tasks of identifying software-related tweets and helpful YouTube comments.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.