Guy De Pauw scite author profile

Automatic detection of cyberbullying in social media text

While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages, and the information overload on the Web requires intelligent systems to identify potential risks automatically. The focus of this paper is on automatic cyberbullying detection in social media text by modelling posts written by bullies, victims, and bystanders of online bullying. We describe the collection and fine-grained annotation of a cyberbullying corpus for English and Dutch and perform a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection. We make use of linear support vector machines exploiting a rich feature set and investigate which information sources contribute most to the task. Experiments on a hold-out test set reveal promising results for the detection of cyberbullying-related posts. After optimisation of the hyperparameters, the classifier yields an F1 score of 64% and 61% for English and Dutch respectively, and considerably outperforms baseline systems.
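
For reference, the F1 score quoted above is the harmonic mean of precision and recall. The sketch below assumes it is computed on the positive (cyberbullying-related) class, which the abstract does not state explicitly; the labels are made up for illustration and do not come from the paper's corpus.

```python
def precision_recall_f1(gold, predicted, positive=1):
    """Return (precision, recall, F1) for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical toy labels (1 = bullying-related post, 0 = other post).
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because F1 ignores true negatives, it is a common choice when the positive class is rare, as bullying posts typically are.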

Speech Rate in a Pluricentric Language: A Comparison Between Dutch in Belgium and the Netherlands

Verhoeven 2004

This paper investigates speech rate in two standard national varieties of Dutch on the basis of 160 conversations of 15 minutes each with native speakers from four different regions in the Netherlands and four in the Dutch-speaking part of Belgium (Flanders). Speech rate was quantified as articulation rate and speaking rate, both expressed as the number of syllables per second (syll/s). The results show a significant effect of speakers' country of origin: subjects in the Netherlands speak 16% faster than subjects in Belgium (articulation: 5.05 vs. 4.23 syll/s, speaking: 4.23 vs. 4.00 syll/s). In addition, the independent variable sex was also found to be significant: on average, men speak 6% faster than women (articulation: 4.79 vs. 4.50 syll/s, speaking: 4.23 vs. 4.01 syll/s). The independent variable age was significant too: younger subjects speak 5% faster than older ones (articulation: 4.78 vs. 4.52 syll/s, speaking: 4.23 vs. 4.01 syll/s). The findings of this study confirm the traditional view that speech rate is determined by extralinguistic variables, but also suggest that there may be intrinsic tempo differences between language varieties.
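
The abstract does not define the two measures. Under the usual convention, which is an assumption here, articulation rate divides the syllable count by speaking time with pauses excluded, while speaking rate divides it by total time with pauses included. A minimal Python sketch with a made-up speaker, followed by the percentage arithmetic on the articulation rates quoted above:

```python
def rates(syllables, total_time_s, pause_time_s):
    """Return (articulation rate, speaking rate) in syllables per second."""
    articulation = syllables / (total_time_s - pause_time_s)  # pauses excluded
    speaking = syllables / total_time_s                       # pauses included
    return articulation, speaking

def percent_difference(faster, slower, relative_to):
    """Difference between two rates as a percentage of `relative_to`."""
    return 100.0 * (faster - slower) / relative_to

# Hypothetical speaker: 900 syllables in 220 s, of which 40 s are pauses.
articulation, speaking = rates(900, 220.0, 40.0)
print(f"{articulation:.2f} syll/s articulation, {speaking:.2f} syll/s speaking")

# Articulation rates quoted in the abstract (Netherlands vs. Belgium).
nl, be = 5.05, 4.23
vs_belgium = percent_difference(nl, be, relative_to=be)
vs_netherlands = percent_difference(nl, be, relative_to=nl)
print(f"{vs_belgium:.1f}% of the Belgian rate, {vs_netherlands:.1f}% of the Dutch rate")
```

With these inputs the gap is about 19% of the Belgian rate and about 16% of the Dutch rate, so the reported 16% appears to be expressed relative to the faster group. That reading is an inference from the numbers, not something the abstract states.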

Online hatred of women in the Incels.me forum

Jaki, Smedt, Gwóźdź et al. 2019

JLAC

109

This paper presents a study of the (now suspended) online discussion forum Incels.me and its users, involuntary celibates or incels, a virtual community of isolated men without a sexual life, who see women as the cause of their problems and often use the forum for misogynistic hate speech and other forms of incitement. Involuntary celibates have attracted media attention and concern after a killing spree in April 2018 in Toronto, Canada. The aim of this study is to shed light on the group dynamics of the incel community by applying mixed-methods quantitative and qualitative approaches to analyze how the users of the forum create in-group identity and how they construct major out-groups, particularly women. We investigate the vernacular used by incels, apply automatic profiling techniques to determine who they are, discuss the hate speech posted in the forum, and propose a deep learning system that is able to detect instances of misogyny, homophobia, and racism with approximately 95% accuracy.

Using a Personality-Profiling Algorithm to Investigate Political Microtargeting: Assessing the Persuasion Effects of Personality-Tailored Ads on Social Media

Zarouali, Dobber, Pauw et al. 2020

Communication Research

Political advertisers have access to increasingly sophisticated microtargeting techniques. One such technique is tailoring ads to the personality traits of citizens. Questions have been raised about the effectiveness of this political microtargeting (PMT) technique. In two experiments, we investigate the causal effects of personality-congruent political ads. In Study 1, we first assess participants’ extraversion trait by means of their own text data (i.e., by using a personality profiling algorithm), and in a second phase, target them with either a personality-congruent or incongruent political ad. In Study 2, we follow the same protocol, but instead target participants with emotionally charged congruent ads, to establish whether PMT can be effective on an affect-based level. The results show evidence that citizens are more strongly persuaded by political ads that match their own personality traits. These findings contribute to a salient and timely academic and societal debate.

Anterior tooth morphology and its effect on torque

Loenen, Degrieck, Pauw et al. 2005

This study was undertaken to determine the variation in crown-root angle (CRA) of the upper incisors and canines as well as the variation in their labial contour. In addition, the influence of the variability of the labial contour and of different bracket heights on torque was evaluated. Proximal radiographs were taken of 160 extracted maxillary teeth (81 incisors and 79 canines). They were digitized and analysed with Jasc Paint Shop Pro 7™ and Mathcad 2001 Professional. The incisal edge, the centre of the cemento-enamel junction (CEJ), and the root apex were digitized to define the crown and root long axis. For all teeth the CRA was measured. At several heights of the labial surface a tangent was determined, enabling measurement of the inclination of the labial surface. The CRA had great variability, ranging from 167 to 195 degrees for the canines (mean value 183 degrees) and from 171 to 195 degrees for the incisors (average 184 degrees). The mean inclinations of the labial surfaces for the incisors varied greatly. Between 4 and 4.5 mm from the incisal edge the standard deviations (SD) were the smallest and between 2 and 4.5 mm from the incisal edge the labial surface angle differed by approximately 10 degrees. For the canines the mean inclinations of the buccal surface also varied. This angle differed by around 10 degrees between 2 and 4.5 mm from the incisal edge, but the SD were much larger than for the incisors. It can be concluded that placement of a bracket on a tooth at varying heights, still within a clinically acceptable range, results in important differences in the amount of root torque.
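
The crown-root angle described above follows from the three digitized landmarks: the crown axis runs from the CEJ centre to the incisal edge, the root axis from the CEJ centre to the root apex, and the CRA is the angle between them (180 degrees for a straight tooth). The Python sketch below is one way to compute it, assuming 2D image coordinates and a sign convention of its own, since the abstract does not say which direction of bend counts as more than 180 degrees. The coordinates are made up.

```python
import math

def crown_root_angle(incisal_edge, cej_centre, root_apex):
    """Angle in degrees at the CEJ centre between crown and root axes.

    Measured counter-clockwise from the crown axis to the root axis, in
    [0, 360). The direction is an assumption of this sketch.
    """
    cx, cy = cej_centre
    crown = (incisal_edge[0] - cx, incisal_edge[1] - cy)
    root = (root_apex[0] - cx, root_apex[1] - cy)
    cross = crown[0] * root[1] - crown[1] * root[0]
    dot = crown[0] * root[0] + crown[1] * root[1]
    return math.degrees(math.atan2(cross, dot)) % 360.0

# Hypothetical landmarks in millimetres, CEJ centre at the origin.
straight = crown_root_angle((0.0, 10.0), (0.0, 0.0), (0.0, -12.0))
bent = crown_root_angle((0.0, 10.0), (0.0, 0.0), (1.0, -12.0))
print(f"straight tooth: {straight:.1f} degrees, bent root: {bent:.1f} degrees")
```

Using `atan2` on the cross and dot products, rather than `acos` on the dot product alone, keeps the sign of the bend, which is needed to report values on both sides of 180 degrees as the abstract does.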

Automatic Diacritic Restoration for Resource-Scarce Languages

Pauw, Wagacha, Schryver 2007

The orthography of many resource-scarce languages includes diacritically marked characters. Falling outside the scope of the standard Latin encoding, these characters are often represented in digital language resources as their unmarked equivalents. This renders corpus compilation more difficult, as these languages typically do not have the benefit of large electronic dictionaries to perform diacritic restoration. This paper describes experiments with a machine learning approach that is able to automatically restore diacritics on the basis of local graphemic context. We apply the method to the African languages of Cilubà, Gĩkũyũ, Kĩkamba, Maa, Sesotho sa Leboa, Tshivenda and Yoruba and contrast it with experiments on Czech, Dutch, French, German and Romanian, as well as Vietnamese and Chinese Pinyin.
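
To make "local graphemic context" concrete, the Python sketch below treats restoration as per-character classification: each unmarked character is labelled with its marked form based on a window of neighbouring characters. The learner is a simple majority-vote lookup with a backoff to the character alone, used as a stand-in and not as the learner from the paper. The window width and the toy word list are assumptions, and the code assumes every marked character has a single precomposed Unicode form.

```python
import unicodedata
from collections import Counter, defaultdict

def strip_diacritics(text):
    """Remove combining marks, so that marked letters become plain ones."""
    decomposed = unicodedata.normalize("NFD", text)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", plain)

def instances(marked, width=2):
    """Yield (context window of plain characters, marked character) pairs."""
    marked = unicodedata.normalize("NFC", marked)
    padded = "_" * width + strip_diacritics(marked) + "_" * width
    for i, target in enumerate(marked):
        yield padded[i:i + 2 * width + 1], target

def train(words, width=2):
    """Count marked forms per full context and per focus character."""
    by_context, by_char = defaultdict(Counter), defaultdict(Counter)
    for word in words:
        for context, target in instances(word, width):
            by_context[context][target] += 1
            by_char[context[width]][target] += 1
    return by_context, by_char

def restore(plain, model, width=2):
    """Pick the most frequent marked form, backing off to the character."""
    by_context, by_char = model
    padded = "_" * width + plain + "_" * width
    restored = []
    for i, ch in enumerate(plain):
        votes = by_context.get(padded[i:i + 2 * width + 1]) or by_char.get(ch)
        restored.append(votes.most_common(1)[0][0] if votes else ch)
    return "".join(restored)

# Toy training list, for illustration only.
model = train(["gĩkũyũ", "mũndũ", "kĩrĩma"])
restored = restore("gikuyu", model)
print(restored)
```

Working from character windows instead of a word list matches the setting the abstract describes, where large electronic dictionaries are not available.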

Craniofacial structure in Marfan syndrome: A cephalometric study

Coster, Pauw, Martens et al. 2004

American J of Med Genetics Pt A

Marfan syndrome (MFS) is a connective tissue disorder with autosomal dominant inheritance. Mutations in the FBN1 gene cause deficient processing of fibrillin-1, the main constituent of extracellular microfibrils, affecting tissues displaying elastic properties. Clinical manifestations are widespread and involve the skeletal, ocular, cardiovascular and pulmonary systems, skin and integumentum, and dura. A highly arched palate and retrognathia have been assigned to the symptoms with minor diagnostic specificity, although epidemiological data on prevalence are still lacking. Twenty-six patients with MFS were studied for craniofacial characteristics using cephalometric measurements on lateral cranial radiographs. The purposes of this study were (1) to compare cephalometric variables of the MFS group with age- and sex-matched population norms, and (2) to assess differences in palatal vault dimensions between adult MFS patients (n = 17) and matched controls (n = 32) by means of cephalometric measurements. Significant differences with population norms were found in the structures of the cranial base, the maxillary complex, the mandible body, and the relations of the jaws with respect to the cranial base and to each other. Palatal height and palatal length were significantly larger in MFS, and were significantly correlated to each other and to the height of the maxillo-alveolar processus. The present data disprove in part previously reported findings, possibly due to biased patient selection in these studies or demographic differences. However, a strong correlation was found between maxillary/mandibular retrognathia, long face, highly arched palate, and MFS. A combination of both intrinsic genetic factors and environmental factors is suggested as a possible explanation for specific morphogenetic aspects of the craniofacial complex in MFS.

Current limitations in cyberbullying detection: On evaluation criteria, reproducibility, and data scarcity

Emmery, Verhoeven, Pauw et al. 2020

Lang Resources & Evaluation

The detection of online cyberbullying has seen an increase in societal importance, popularity in research, and available open data. Nevertheless, while computational power and affordability of resources continue to increase, the access restrictions on high-quality data limit the applicability of state-of-the-art techniques. Consequently, much of the recent research uses small, heterogeneous datasets, without a thorough evaluation of applicability. In this paper, we further illustrate these issues, as we (i) evaluate many publicly available resources for this task and demonstrate difficulties with data collection. These predominantly yield small datasets that fail to capture the required complex social dynamics and impede direct comparison of progress. We (ii) conduct an extensive set of experiments that indicate a general lack of cross-domain generalization of classifiers trained on these sources, and openly provide this framework to replicate and extend our evaluation criteria. Finally, we (iii) present an effective crowdsourcing method: simulating real-life bullying scenarios in a lab setting generates plausible data that can be effectively used to enrich real data. This largely circumvents the restrictions on data that can be collected, and increases classifier performance. We believe these contributions can aid in improving the empirical practices of future research in the field.
