Automatic gender detection has attracted attention in many research fields, such as forensic linguistics and marketing. Within these areas, gender detection has been approached as a classification problem and, for this reason, supervised Machine Learning algorithms such as Naïve Bayes, Logistic Regression, and Support Vector Machines, among others, have been employed. Of these, Support Vector Machines have exhibited the best performance on gender detection. In recent years, with the development of Deep Learning methods, various neural network structures, such as Convolutional Neural Networks, have been designed for gender detection. However, Deep Learning methods have led to a loss in the interpretability of the models. In this article, we review the AI techniques applied to gender detection.
This research analyzes sociolinguistic adequacy in advertisements. The analysis aims to observe how far advertising reflects the covariation phenomena manifested at the different linguistic levels as a result of the conjunction of certain social and contextual variables. The research also aims to show that sociolinguistic adequacy in ads can increase the effectiveness of advertising communication. We present a qualitative study of a corpus of 60 television commercials, selected by taking into consideration the age and gender of the speaker as well as the product or service advertised. On the basis of the results obtained, advertising communication reflects both genderlect and agelect variation. We have also observed differences in style, and therefore advertising includes diaphasic variation. However, we perceive the transmission of sociolinguistic stereotypes that encourage the creation and diffusion of certain sociocultural roles.
Within the area of Natural Language Processing, we approached the Author Profiling task as a text classification problem. Based on the author’s writing style, sociodemographic information such as the author’s gender, age, or native language can be predicted. The exponential growth of user-generated data and the development of Machine Learning techniques have led to significant advances in automatic gender detection. Unfortunately, gender detection models often become black boxes in terms of interpretability. In this paper, we propose a tree-based computational model for gender detection made up of 198 features. Unlike previous work on gender detection, we organized the features from a linguistic perspective into six categories: orthographic, morphological, lexical, syntactic, digital, and pragmatics-discursive. We implemented a Decision Tree classifier to evaluate the performance of all feature combinations, and the experiments revealed that, on average, the classification accuracy increased by up to 3.25% with the addition of feature sets. The maximum classification accuracy was reached by a three-level model that combined lexical, syntactic, and digital features. We present the most relevant features for gender detection according to the trees generated by the classifier and contextualize the significance of the computational results with the linguistic patterns defined by previous research in relation to gender.
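The evaluation over "all feature combinations" can be pictured with a minimal sketch. It assumes that a combination means any non-empty subset of the six feature categories and that a "three-level model" is a subset of three categories; the abstract does not spell this out. The names `category_combinations`, `mean_accuracy_by_level`, and `toy_score` are hypothetical, and `toy_score` is a placeholder standing in for training and testing a decision tree, not the authors' classifier or results.

```python
from itertools import combinations
from statistics import mean

# The six linguistic categories named in the abstract.
CATEGORIES = ["orthographic", "morphological", "lexical",
              "syntactic", "digital", "pragmatics-discursive"]


def category_combinations(categories):
    """Yield every non-empty combination of categories, smallest first."""
    for size in range(1, len(categories) + 1):
        for combo in combinations(categories, size):
            yield combo


def mean_accuracy_by_level(score):
    """Average a scoring function over all combinations of each size."""
    by_level = {}
    for combo in category_combinations(CATEGORIES):
        by_level.setdefault(len(combo), []).append(score(combo))
    return {level: mean(scores) for level, scores in by_level.items()}


def toy_score(combo):
    # Hypothetical placeholder: a real experiment would train a decision
    # tree on the features in `combo` and return its test accuracy.
    return 0.5 + 0.01 * len(combo)


all_combos = list(category_combinations(CATEGORIES))
levels = mean_accuracy_by_level(toy_score)
```

Under this reading, six categories give 63 candidate models, 20 of which combine exactly three categories, so an exhaustive search remains cheap even when each model is trained from scratch.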