Millions of people now use Online Social Networks (OSNs) such as Twitter, Facebook, and Sina Microblog to express opinions on current events. The widespread use of these OSNs has also led to the emergence of social bots, some of which are effective enough to become influential users. In this paper, we study the automated construction technology and infiltration strategies of social bots in Sina Microblog, aiming to build friendly and influential social bots that resist malicious interpretations. First, we study the critical technology of Sina Microblog data collection, which indicates that the platform's defense mechanism is vulnerable. Then, we construct 96 social bots in Sina Microblog and examine the influence of different infiltration strategies, such as different attribute settings and various types of interactions. Finally, our social bots gained 5546 followers in the 42-day infiltration period with a 100% survival rate. The results show that the proposed infiltration strategies are effective and can also help social bots escape detection by the Sina Microblog defense mechanism. This study sounds an alarm for the Sina Microblog defense mechanism and provides a valuable reference for social bot detection.
Abstract. As one of the most popular research fields in machine learning, learning from imbalanced datasets has received increasing attention in recent years. The imbalance problem usually occurs when minority classes have far fewer samples than the others. Traditional classification algorithms do not take the distribution of the dataset into consideration, so they fail to handle class-imbalanced learning, and classification performance tends to be dominated by the majority class. SMOTE is one of the most effective over-sampling methods for this problem; it changes the distribution of the training set by increasing the size of the minority class. However, SMOTE can easily result in over-fitting on account of too many repetitive data samples. To address this issue, this paper proposes an improved method based on sparse representation theory and over-sampling, named SROT (Sparse Representation-based Over-sampling Technique). SROT uses a sparse dictionary to create synthetic samples directly for solving the imbalance problem. The experiments are performed on 10 UCI datasets using C4.5 as the learning algorithm. The experimental results show that, compared with random over-sampling, SMOTE, and other methods, SROT achieves better performance in terms of AUC.
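For orientation, the SMOTE baseline mentioned above can be sketched as interpolation between a minority sample and one of its nearest minority neighbours. This is a minimal illustration of the general SMOTE idea only, not the SROT method, whose sparse-dictionary construction the abstract does not specify. The function name `smote_sketch` and its parameters are hypothetical and chosen for this example.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by SMOTE-style interpolation.

    `minority` is a list of equal-length numeric feature vectors.
    All names here are illustrative assumptions, not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Pick one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the connecting segment.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for sample in smote_sketch(points, n_synthetic=3, k=2):
        print(sample)
```

Every synthetic point lies on a segment between two existing minority samples, so new samples stay inside the region the minority class already occupies.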
Online users are typically active on multiple social media networks (SMNs), which constitute a multiplex social network. With improvements in cybersecurity awareness, users increasingly choose different usernames and provide different profiles on different SMNs. Thus, it is becoming increasingly challenging to determine whether given accounts on different SMNs belong to the same user; this can be expressed as an interlayer link prediction problem in a multiplex network. To address the challenge of predicting interlayer links, feature or structure information is leveraged. Existing methods that use network embedding techniques to address this problem focus on learning a mapping function to unify all nodes into a common latent representation space for prediction; positional relationships between unmatched nodes and their common matched neighbors (CMNs) are not utilized. Furthermore, the layers are often modeled as unweighted graphs, ignoring the strengths of the relationships between nodes. To address these limitations, we propose a framework based on multiple types of consistency between embedding vectors (MulCEV). In MulCEV, the traditional embedding-based method is applied to obtain the degree of consistency between the vectors representing the unmatched nodes, and a proposed distance consistency index based on the positions of nodes in each latent space provides additional clues for