Abstract: To support a software development team in its day-to-day operations, different data sources can be exploited. In this paper, we focus on CVS logs and on communication profiles between developers derived from RFID proximity information. We provide a novel approach for combining these data sources into a graph, and apply the PageRank algorithm to capture interesting knowledge about resource and developer profiles. Additionally, we discuss applications of the approach both in the software development setting and in project management. The proposed approach is evaluated in a real-world development setting.
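As a rough illustration of the idea in this abstract, the following minimal sketch merges commit records and proximity contacts into one weighted graph and ranks its nodes with a plain power-iteration PageRank. The abstract does not specify how the graph is built, so the edge construction, the unit weights, the function names `build_graph` and `pagerank`, and the sample data are all assumptions made for illustration, not the authors' method.

```python
def build_graph(commits, contacts):
    """Merge two data sources into one weighted edge map.

    commits:  iterable of (developer, resource) pairs, e.g. taken from a CVS log
    contacts: iterable of (developer_a, developer_b) proximity contacts
    Both edge types are added in both directions with weight 1 per
    observation (an assumption; the abstract gives no weighting scheme).
    """
    edges = {}

    def add(a, b, weight):
        edges[(a, b)] = edges.get((a, b), 0.0) + weight

    for developer, resource in commits:
        add(developer, resource, 1.0)
        add(resource, developer, 1.0)
    for dev_a, dev_b in contacts:
        add(dev_a, dev_b, 1.0)
        add(dev_b, dev_a, 1.0)
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over an edge map."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for (source, _), weight in edges.items():
        out_weight[source] += weight

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for (source, target), weight in edges.items():
            new_rank[target] += damping * rank[source] * weight / out_weight[source]
        rank = new_rank
    return rank


# Hypothetical sample data: developer names and file names are made up.
commits = [("alice", "core.py"), ("bob", "core.py"), ("alice", "ui.py")]
contacts = [("alice", "bob")]
ranks = pagerank(build_graph(commits, contacts))
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

Because developers and resources share one graph, a single ranking covers both node types; filtering the result by node type would give separate developer and resource profiles.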
In this paper, we analyze the stability of user interaction within Twitter, focusing on link decay prediction: for a tweet created by one user that mentions another user, we study the task of predicting the decay of the corresponding interaction link over time. For this task, we employ the history of timestamped mention interactions between both users as time series features. We also tackle the problem of efficiently balancing a large dataset with a skewed class distribution, which arises naturally in our context. The proposed impurity-based supervised sampling (ISS) approach balances the data in one pass by removing trivial training data of the overrepresented class. Our approach is evaluated using the well-known Twitter dump of 2009 [25]. We show that ISS outperforms downsampling with regard to the resulting predictor performance.
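To make the notion of time series features from a mention history more concrete, here is a minimal sketch that turns the timestamps of past mentions between two users into per-window counts plus a recency value. The abstract does not list the actual features, so the function name `mention_features`, the window length, the number of windows, and the chosen features are assumptions for illustration only.

```python
def mention_features(timestamps, now, window=86400, n_windows=7):
    """Summarize past mentions between two users as simple features.

    timestamps: Unix times of earlier mentions between the user pair
    now:        Unix time of the tweet whose link decay is to be predicted
    window:     window length in seconds (one day here, an assumption)
    n_windows:  how many windows back from `now` to count
    """
    counts = [0] * n_windows
    for t in timestamps:
        age = now - t
        if 0 <= age < window * n_windows:
            # Index 0 is the most recent window, higher indices are older.
            counts[int(age // window)] += 1

    past = [t for t in timestamps if t <= now]
    recency = now - max(past) if past else None
    return {"counts": counts, "recency": recency, "total": len(past)}


# Hypothetical history: three mentions, evaluated at the time of the last one.
features = mention_features([0, 10, 100000], now=100000, n_windows=2)
print(features)
```

A feature dictionary like this could then be flattened into a vector and passed to any supervised classifier.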