Do Models of Mental Health Based on Social Media Data Generalize?

Harrigian, Keith; Aguirre, Carlos; Dredze, Mark

doi:10.18653/v1/2020.findings-emnlp.337

Cited by 37 publications

(44 citation statements)

References 69 publications

“…The clinically-annotated datasets that do exist are either proprietary or do not provide a clear mechanism for inquiring about availability. The dearth of large, shareable datasets based on actual clinical diagnoses and medical ground truth is problematic given recent research that calls into question the validity of proxy-based mental health annotations (Ernala et al, 2019; Harrigian et al, 2020). By leveraging privacy-preserving technology (e.g.…”

Section: Discussion (mentioning)

confidence: 99%

“…Mental Health Models. We create mental health models for these datasets based on recent work (Harrigian et al, 2020a;. Following standard pre-processing procedures, we filter numeric values, username mentions, retweets and urls from raw tweet text.…”

Section: Methods (mentioning)

confidence: 99%
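
The pre-processing step quoted above (filtering numeric values, username mentions, retweets, and URLs from raw tweet text) could look roughly like the following. This is a minimal sketch and not the cited authors' code: the function names `clean_tweet` and `clean_tweets`, the regular expressions, and the choice to drop a retweet entirely (rather than strip only its `RT` prefix) are all assumptions made for illustration.

```python
import re
from typing import List, Optional

# All patterns below are illustrative assumptions, not taken from the cited work.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NUMBER_RE = re.compile(r"\b\d+(?:[.,]\d+)*\b")
RETWEET_RE = re.compile(r"^RT\s+@\w+", re.IGNORECASE)


def clean_tweet(text: str) -> Optional[str]:
    """Return the filtered tweet text, or None if the tweet is a retweet.

    Assumption: a retweet is any tweet that starts with "RT @user".
    """
    if RETWEET_RE.match(text.strip()):
        return None
    # Remove URLs first so digits inside links are not treated as numbers.
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NUMBER_RE.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())


def clean_tweets(tweets: List[str]) -> List[str]:
    """Clean a batch of tweets, dropping retweets and empty results."""
    cleaned = (clean_tweet(t) for t in tweets)
    return [c for c in cleaned if c]
```

Returning `None` for retweets keeps the decision to discard them visible to the caller; a pipeline that prefers to keep retweeted text would instead strip only the `RT @user:` prefix.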

“…For model features, we considered TF-IDF vector representations, mean-pooled 200 dimensional Twitter GloVe embeddings, Linguistic Inquiry Word Count (LIWC) representations, and features based on topic distributions learned via LDA. We train ℓ2-regularized logistic regression models on both datasets and follow hyper-parameter tuning procedures from Harrigian et al (2020a); .…”

Section: Methods (mentioning)

confidence: 99%

“…Others have used qualitative studies to analyze behaviors and performance of machine learning models in general. Previous work has analyzed representative sentences (Ettinger, 2020), hashtags (Sykora et al, 2020), performed a thematic analysis by using the Linguistic Inquiry and Word Count dictionary or trained topic models (Harrigian et al, 2020a;.…”

Section: Introduction (mentioning)

confidence: 99%

“…We base our analysis on datasets from previous work using Twitter. We train simple text-based models based on previous work on these datasets (Harrigian et al, 2020a;. We use a labeled topic model to characterize what content indicates depression and how this content varies by demographic group.…”

Section: Introduction (mentioning)

confidence: 99%

Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access

2021

Recently, research on mental health conditions using public online data, including Reddit, has surged in NLP and health research but has not reported user characteristics, which are important to judge generalisability of findings. This paper shows how existing NLP methods can yield information on clinical, demographic, and identity characteristics of almost 20K Reddit users who self-report a bipolar disorder diagnosis. This population consists of slightly more feminine- than masculine-gendered, mainly young or middle-aged, US-based adults who often report additional mental health diagnoses, which is compared with general Reddit statistics and epidemiological studies. Additionally, this paper carefully evaluates all methods and discusses ethical issues.

A comprehensive survey on online social networks security and privacy issues: Threats, machine learning‐based solutions, and open challenges

Bhattacharya, Roy, Chattopadhyay, et al.

2022

Security and Privacy

Over the past few years, online social networks (OSNs) have become an inseparable part of people's daily lives. Instead of being passive readers, people now enjoy their role as content contributors. OSNs permit their users to share information, including multimedia content. OSN users can express themselves in virtual communities by providing their opinions and interacting with others. As a consequence, privacy and security threats in OSNs have emerged as a major concern for the research and business world. In the recent past, a number of surveys have discussed different security and privacy threats in OSNs. However, to date, no survey has aimed to classify and analyze the various machine learning (ML)‐based solutions adapted for the security defense of OSNs. In this survey article, we present a detailed taxonomy with a classification of works on various security attacks in OSNs. We then review and summarize the existing state-of-the-art surveys on OSN security, and indicate their merits and limitations. Next, we review all recent works that aim to provide ML‐based solutions for defending against security attacks on OSNs. Finally, we discuss the future road‐map for OSN security and provide a comprehensive analysis of the open research issues with feasible measurements and possible solutions.
