BackgroundIntimate partner violence (IPV) is a preventable public health issue that affects millions of people worldwide. Approximately one in four women are estimated to be or have been victims of severe violence at some point in their lives, irrespective of their age, ethnicity, and economic status. Victims often report IPV experiences on social media, and automatic detection of such reports via machine learning may enable the proactive and targeted distribution of support and/or interventions for those in need.MethodsWe collected posts from Twitter using a list of keywords related to IPV. We manually reviewed subsets of retrieved posts, and prepared annotation guidelines to categorize tweets into IPV-report or non-IPV-report. We manually annotated a random subset of the collected tweets according to the guidelines, and used them to train and evaluate multiple supervised classification models. For the best classification strategy, we examined the model errors, bias, and trustworthiness through manual and automated content analysis.ResultsWe annotated a total of 6,348 tweets, with inter-annotator agreement (IAA) of 0.86 (Cohen’s kappa) among 1,834 double-annotated tweets. The dataset had substantial class imbalance, with only 668 (∼11%) tweets representing IPV-reports. The RoBERTa model achieved the best classification performance (accuracy: 95%; IPV-report F1-score 0.76; non-IPV-report F1-score 0.97). Content analysis of the tweets revealed that the RoBERTa model sometimes misclassified as it focused on IPV-irrelevant words or symbols during decision making. Classification outcome and word importance analyses showed that our developed model is not biased toward gender or ethnicity while making classification decisions.ConclusionOur study developed an effective NLP model to identify IPV-reporting tweets automatically and in real time. The developed model can be an essential component for providing proactive social media based intervention and support for victims. It may also be used for population-level surveillance and conducting large-scale cohort studies.
Intimate partner violence (IPV) increased during the COVID-19 pandemic. Collecting actionable IPV-related data from conventional sources (e.g., medical records) was challenging during the pandemic, generating a need to obtain relevant data from non-conventional sources, such as social media. Social media, like Reddit, is a preferred medium of communication for IPV survivors to share their experiences and seek support with protected anonymity. Nevertheless, the scope of available IPV-related data on social media is rarely documented. Thus, we examined the availability of IPV-related information on Reddit and the characteristics of the reported IPV during the pandemic. Using natural language processing, we collected publicly available Reddit data from four IPV-related subreddits between January 1, 2020 and March 31, 2021. Of 4,000 collected posts, we randomly sampled 300 posts for analysis. Three individuals on the research team independently coded the data and resolved the coding discrepancies through discussions. We adopted quantitative content analysis and calculated the frequency of the identified codes. 36% of the posts ( n = 108) constituted self-reported IPV by survivors, of which 40% regarded current/ongoing IPV, and 14% contained help-seeking messages. A majority of the survivors’ posts reflected psychological aggression, followed by physical violence. Notably, 61.4% of the psychological aggression involved expressive aggression, followed by gaslighting (54.3%) and coercive control (44.3%). Survivors’ top three needs during the pandemic were hearing similar experiences, legal advice, and validating their feelings/reactions/thoughts/actions. Albeit limited, data from bystanders (survivors’ friends, family, or neighbors) were also available. Rich data reflecting IPV survivors’ lived experiences were available on Reddit. Such information will be useful for IPV surveillance, prevention, and intervention.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.