Automated detection of hate speech and inappropriate text on social media and other internet platforms has attracted considerable interest in recent years and has become a valuable research topic for both industry and academia. It is increasingly important for applications to identify disruptive content, analyze sentiment, identify cyberbullying, and detect flames, threats, and hatred directed at individuals or particular communities or groups. Text classification is a challenging task because of the nature and complexity of language, especially the context, micro words, emojis, typographical errors, and sarcasm present in the text. In this paper, we propose a model with a novel approach to generating hybrid features that yield an effective feature representation for classifying hate speech. We combine features learned by deep learning methods with semantic features such as word n-grams and tweet-specific syntactic features to form hybrid feature sets. We also improve the preprocessing steps to reduce the number of missing embeddings, which enlarges the vocabulary available for efficient feature learning. We experiment with various neural networks for feature learning and with machine learning models that use the hybrid features for classification. Our work delivers hybrid features and appropriate preprocessing techniques for efficient classification of the standard dataset of 16k annotated hate speech tweets. The combination of a Long Short-Term Memory (LSTM) network trained on random embeddings for deep feature extraction and Logistic Regression (LR) as a classifier over the hybrid features is found to be the best model, and it outperforms the state of the art reported in the literature.
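The hybrid feature idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the particular tweet-specific counts, and the tokenizer are assumptions made for illustration, and the `learned` argument is only a placeholder for a feature vector that an LSTM would produce. The resulting vector is the kind of input that could be passed to a classifier such as logistic regression.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase, drop URLs, and split into simple word tokens (assumed tokenizer)."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.findall(r"[a-z0-9']+", text)


def word_ngrams(tokens, n_max=2):
    """Count all word n-grams for n = 1..n_max."""
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams


def tweet_features(text):
    """Hypothetical tweet-specific counts: mentions, hashtags, URLs, exclamation marks."""
    return [
        len(re.findall(r"@\w+", text)),
        len(re.findall(r"#\w+", text)),
        len(re.findall(r"https?://\S+", text)),
        text.count("!"),
    ]


def hybrid_vector(text, vocab, learned):
    """Concatenate n-gram counts, tweet-specific counts, and learned features.

    `vocab` is a fixed list of n-grams; `learned` stands in for the output
    of a trained neural feature extractor (an assumption, not computed here).
    """
    grams = word_ngrams(tokenize(text))
    ngram_part = [grams[g] for g in vocab]
    return ngram_part + tweet_features(text) + list(learned)


example = hybrid_vector(
    "@user you are #awful !! http://t.co/x",
    vocab=["you are", "awful", "nice"],
    learned=[0.25, -0.5],  # placeholder values, not real LSTM output
)
```

Keeping the three feature groups as separate functions makes it easy to ablate one group at a time when comparing feature sets.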
Wireless sensor networks consist of autonomous sensor nodes deployed in an ad hoc manner to collect information about their surroundings. These sensor nodes usually forward the collected information to data sink nodes through information routing techniques. The Predictive Node Expiration Based Energy-aware Source Routing (PNEB ESR) protocol is an energy-aware routing protocol that attempts to optimize the overall energy efficiency of the sensor network and ensures that the sensed information reaches a data sink through a reliable path. In this protocol, the Energy Depletion Rate of a node is calculated at predetermined energy levels and is used to predict when the node will outlive its usefulness. These node predictions are in turn used to predict route expirations, which gives nodes timely insight into when to seek alternative routes to data sinks. Simulations show that the PNEB ESR protocol is successful in making reliable route selections and theoretically minimizes the number of control packets flowing in the network. This leads to a considerable reduction in processing and communication costs and hence minimizes overall energy consumption.
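The prediction step can be sketched as follows. The abstract does not give the formulas, so this sketch assumes a simple linear model: the depletion rate is the energy consumed between two checkpoints divided by the elapsed time, a node's expiry is extrapolated from that rate, and a route expires when its earliest-expiring node does. All function names and the `e_min` usefulness threshold are hypothetical.

```python
def depletion_rate(e_prev, t_prev, e_now, t_now):
    """Energy consumed per unit time between two energy-level checkpoints."""
    if t_now <= t_prev:
        raise ValueError("checkpoints must be in increasing time order")
    return (e_prev - e_now) / (t_now - t_prev)


def predicted_expiry(e_now, t_now, rate, e_min=0.0):
    """Linearly extrapolate the time at which energy falls to e_min.

    A non-positive rate means the node is not losing energy, so no
    expiry is predicted (returned as infinity).
    """
    if rate <= 0:
        return float("inf")
    return t_now + (e_now - e_min) / rate


def route_expiry(node_expiries):
    """A route is assumed to expire when its first node expires."""
    return min(node_expiries)


# Example: a node drops from 100 to 80 energy units over 10 time units.
rate = depletion_rate(100.0, 0.0, 80.0, 10.0)
node_a = predicted_expiry(80.0, 10.0, rate, e_min=10.0)
route = route_expiry([node_a, 120.0, float("inf")])
```

Under this assumed model, a node holding `route` can schedule its search for an alternative path shortly before that time instead of waiting for the route to fail.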