The interaction of technology with humans has many adverse effects. The rapid growth and reach of social media and the Web have led to the dissemination of questionable and untrusted content among a wider audience, negatively influencing people's lives and judgment. Many research studies have addressed the detection and spread of fake news, that is, misinformation that looks genuine. While the first step of such tasks would be to classify claims based on their credibility, the next steps would involve identifying hidden patterns in the style, syntax, and content of such news claims. We propose a generalized method based on deep neural networks to detect whether a given claim is fake or genuine. We have used a modular approach, combining techniques from information retrieval, natural language processing, and deep learning. Our classifier comprises two main submodules. The first submodule uses the claim to retrieve relevant articles from the knowledge base, which can then be used to verify the truth of the claim. It also uses word-level features for prediction. The second submodule uses a deep neural network to learn the underlying style of fake content. Our experiments on benchmark datasets show that, for the given classification task, we can obtain up to 82.4% accuracy by using a combination of two models; the first model was up to 72% accurate while the second model was around 81% accurate. Our detection model has the potential to automatically detect and prevent the spread of fake news, thus limiting the caustic influence of technology on human lives.
As human beings utilize computing technologies to mediate multiple aspects of their lives, cyberbullying has grown as an important societal challenge. Cyberbullying may lead to deep psychiatric and emotional disorders for those affected. Hence, there is an urgent need to devise automated methods for cyberbullying detection and prevention. While recent cyberbullying detection efforts have defined sophisticated text processing methods for cyberbullying detection, there are as yet few efforts that leverage visual data processing to automatically detect cyberbullying. Based on early analysis of a public, labeled cyberbullying dataset, we report that visual features complement textual features in cyberbullying detection and can help improve predictive results.
A common step in the processing of any text is part-of-speech tagging of the input. In this paper, we present an approach to tackle code-mixed text in three different languages (Bengali, Hindi, and Tamil) apart from English. Our system uses a Conditional Random Field, a sequence learning method that is useful for capturing patterns in sequences containing code switching, to tag each word with accurate part-of-speech information. We have used various pre-processing and post-processing modules to improve the performance of our system. The results were satisfactory, with a highest accuracy of 75.22% on Bengali-English mixed data. The methodology that we employed in the task can be used for any resource-poor language. We adapted standard learning approaches that work well with scarce data. We have also ensured that the system is portable to different platforms and languages and can be deployed for real-time analysis.
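As a rough illustration of the kind of per-token input a Conditional Random Field tagger consumes, the following is a minimal sketch of feature extraction for a code-mixed sentence. The abstract does not list the features the authors used; every feature name here (`suffix3`, `is_ascii`, and so on) and both function names are hypothetical choices for illustration only.

```python
def token_features(tokens, i):
    """Build a feature dict for the token at position i (hypothetical feature set)."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "prefix3": word[:3].lower(),
        "suffix3": word[-3:].lower(),
        "is_title": word.istitle(),
        "is_digit": word.isdigit(),
        # Hashtags and mentions are common in social media text.
        "has_hash_or_at": word.startswith(("#", "@")),
        # Crude script cue: romanized text is ASCII, native-script text is not.
        "is_ascii": word.isascii(),
        # Neighbouring words give the sequence model local context.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def sentence_features(tokens):
    """Return one feature dict per token, in order."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

A list of such dicts per sentence, paired with a list of part-of-speech labels, is the usual training input for dict-based CRF toolkits; the choice of toolkit is left open here.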
An evaluation metric is an absolute necessity for measuring the performance of any system and the complexity of any data. In this paper, we discuss how to determine the level of complexity of code-mixed social media texts, which are growing rapidly due to multilingual interference. In general, texts written in multiple languages are often hard to comprehend and analyze. At the same time, in order to meet the demands of analysis, it is also necessary to determine the complexity of a particular document or text segment. Thus, in the present paper, we discuss the existing metrics for determining the code-mixing complexity of a corpus, along with their advantages and shortcomings, and propose several improvements on the existing metrics. The new index better reflects the variety and complexity of a multilingual document. Also, the index can be applied to a sentence and seamlessly extended to a paragraph or an entire document. We have employed two existing code-mixed corpora to suit the requirements of our study.
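For orientation, one widely cited utterance-level measure of this kind is the Code-Mixing Index. The sketch below assumes the commonly stated form CMI = 100 × (1 − max(w_i) / (n − u)), where n is the number of tokens, u the number of language-independent tokens, and max(w_i) the token count of the dominant language. The abstract does not say which existing metrics it examines, and this is not the improved index the paper proposes; the function name and the `univ` tag are illustrative assumptions.

```python
from collections import Counter


def code_mixing_index(lang_tags, neutral="univ"):
    """Utterance-level CMI from per-token language tags (assumed formula)."""
    n = len(lang_tags)
    # Count only tokens that carry a language; neutral tokens are excluded.
    counts = Counter(tag for tag in lang_tags if tag != neutral)
    u = n - sum(counts.values())
    if n == u:
        # No language-bearing tokens at all: treat as no mixing.
        return 0.0
    return 100.0 * (1.0 - max(counts.values()) / (n - u))
```

A monolingual utterance scores 0, and the score rises as the dominant language accounts for a smaller share of the language-bearing tokens.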
Whenever human beings interact with each other, they exchange or express opinions, emotions and sentiments. These opinions can be expressed in text, speech or images. The analysis of these sentiments is a popular research area today. Sentiment analysis, also known as opinion mining, tries to identify or classify these sentiments or opinions into two broad categories: positive and negative. Much work on sentiment analysis has been done on social media conversations, blog posts, newspaper articles and various narrative texts. However, when it comes to identifying emotions in scientific papers, researchers have faced difficulties due to the implicit and hidden nature of the opinions or emotions. As citation instances are considered inherently positive in emotion, popular ranking and indexing paradigms often neglect the opinion present in a citation. Therefore, in the present paper, we deployed a system of citation sentiment analysis to achieve three major objectives. First, we identified sentiments in the citation text and assigned a score to each of the instances. We have used a supervised classifier for this purpose. Second, we have proposed a new index (hereafter referred to as the M-index) which takes into account both quantitative and qualitative factors while scoring a paper. Finally, we developed a ranking of research papers based on the M-index. We have also shown the impact of the M-index on the ranking of scientific papers.
The Future Conversations workshop at CHIIR'21 looked to the future of search, recommendation, and information interaction to ask: where are the opportunities for conversational interactions? What do we need to do to get there? Furthermore, who stands to benefit? The workshop was hands-on and interactive. Rather than a series of technical talks, we solicited position statements on opportunities, problems, and solutions in conversational search in all modalities (written, spoken, or multimodal). This paper, co-authored by the organisers and participants of the workshop, summarises the submitted statements and the discussions we had during the two sessions of the workshop. Statements discussed during the workshop are available at https://bit.ly/FutureConversations2021Statements.