Abstract: Detecting misinformation and fake content on social media is gaining importance as these platforms grow in popularity. Researchers have addressed this content analysis problem using machine learning tools, with innovations in feature engineering as well as algorithm design. However, most machine learning approaches use a conventional classification setting, in which a classifier is trained on a set of features. In this work, we propose a fusion of a pairwise ranking approach and a classification system to detect misinforming tweets that include multimedia content. Pairwise ranking compares two objects and returns a preference score for the first object in the pair relative to the second. We design a ranking system that determines the legitimacy score of a tweet with reference to another tweet from the same topic of discussion (as hashtagged on Twitter), thereby allowing a contextual comparison. Finally, we incorporate the ranking system outputs within the classification system. The proposed fusion obtains an Unweighted Average Recall (UAR) of 83.5% in classifying misinforming tweets against genuine tweets, a significant improvement over a classification-only baseline system (UAR: 80.1%).
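For readers unfamiliar with the reported metric, UAR is the mean of the per-class recalls, so each class counts equally regardless of how many samples it has. The following is a minimal sketch of that standard definition; the function name and the toy labels are illustrative assumptions and do not come from the paper's data or code.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels: 2 misinforming tweets and 8 genuine ones.
y_true = ["fake", "fake"] + ["real"] * 8
y_pred = ["fake", "real"] + ["real"] * 8

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.2f}, accuracy = {accuracy:.2f}")
```

In this toy case the minority class is recalled only half the time, which UAR penalizes more heavily than plain accuracy does. That property is presumably why UAR suits a task where the two classes may be imbalanced.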
Convolutional Neural Networks (CNNs) have revolutionized performance in several machine learning tasks such as image classification, object tracking, and keyword spotting. However, because they contain a large number of parameters, applying them directly to low-resource tasks is not straightforward. In this work, we apply CNN models through transfer learning to gastrointestinal landmark classification with only a few thousand training samples. As in a standard transfer learning approach, we train CNNs on a large external corpus and then extract representations for the medical images. Finally, a classifier is trained on these CNN representations. However, given that several variants of CNNs exist, the choice of CNN is not obvious. To address this, we develop a novel metric that can be used to predict test performance given CNN representations on the training set. Not only do we demonstrate the superiority of the CNN-based transfer learning approach over an assembly of knowledge-driven features, but the proposed metric also shows an 87% correlation with the test set performance obtained using the various CNN representations.
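The reported agreement between the proposed metric and test performance can be read as a correlation computed across CNN variants, with one (metric value, test score) pair per variant. The abstract does not say which correlation coefficient was used or how the metric itself is defined, so the sketch below assumes a Pearson correlation and uses made-up numbers purely for illustration; `metric_scores` and `test_scores` are hypothetical names.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, one entry per CNN variant (NOT results from the paper):
metric_scores = [0.2, 0.5, 0.4, 0.8]    # metric computed on training-set representations
test_scores = [0.70, 0.78, 0.74, 0.85]  # test performance of a classifier on the same representations

r = pearson(metric_scores, test_scores)
print(f"correlation between metric and test performance: {r:.2f}")
```

A high correlation of this kind would mean the metric, computed from training data alone, ranks the CNN variants in roughly the same order as their eventual test performance.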