Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access 2021
DOI: 10.18653/v1/2021.clpsych-1.8

Determining a Person’s Suicide Risk by Voting on the Short-Term History of Tweets for the CLPsych 2021 Shared Task

Abstract: In this shared task, we accept the challenge of constructing models to identify Twitter users who attempted suicide based on their tweets 30 and 182 days before the adverse event's occurrence. We explore multiple machine learning and deep learning methods to identify a person's suicide risk based on the short-term history of their tweets. Taking the real-life applicability of the model into account, we make the design choice of classifying on the tweet level. By voting the tweet-level suicide risk scores throu…
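
The tweet-level voting idea in the abstract can be illustrated with a minimal sketch. The abstract is truncated before it describes the voting scheme, so everything below is an assumption and not the authors' method: the function name `vote_user_risk`, both thresholds, and the choice of a simple majority-style vote are hypothetical, and the scores are made-up numbers rather than model output.

```python
from statistics import mean


def vote_user_risk(tweet_scores, tweet_threshold=0.5, user_threshold=0.5):
    """Aggregate tweet-level risk scores into one user-level decision.

    Hypothetical scheme (not taken from the paper): each tweet casts a
    binary vote if its score reaches `tweet_threshold`, and the user is
    flagged when the share of positive votes reaches `user_threshold`.
    """
    if not tweet_scores:
        raise ValueError("need at least one tweet score")

    # One binary vote per tweet.
    votes = [score >= tweet_threshold for score in tweet_scores]
    vote_share = sum(votes) / len(votes)

    return {
        "vote_share": vote_share,          # fraction of tweets voting "at risk"
        "mean_score": mean(tweet_scores),  # soft alternative to hard voting
        "at_risk": vote_share >= user_threshold,
    }


# Illustrative scores only.
result = vote_user_risk([0.9, 0.8, 0.2])
print(result["at_risk"], round(result["vote_share"], 2))
```

Returning the mean score alongside the vote share makes it easy to compare hard voting with score averaging on the same input; which of the two, if either, matches the paper cannot be determined from the truncated abstract.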

Cited by 7 publications (5 citation statements)
References 17 publications (10 reference statements)
“…To evaluate the implications of the emotional, linguistic and cognitive facets presented in the text, many have used Linguistic Inquiry Word Count (LIWC) (Pennebaker et al, 2015). For most of the research where manually engineered features were used, the Support Vector Machine (SVM) algorithm (Cortes and Vapnik, 1995) (Bayram and Benhiba, 2021) and a Bayesian model (Gamoran et al, 2021) while Matero et al (2019) used RNN-based architectures and Mohammadi et al (2019) used a fusion approach where RNN-based architectures were combined with CNN and SVM models to produce the best results at CLPsych 2019.…”
Section: Related Work (mentioning)
confidence: 99%
“…During CLPsych 2019 (Zirikly et al, 2019) and CLPsych 2021 (MacAvaney et al, 2021) shared tasks, the participants produced results using either traditional machine learning or deep learning algorithms where logistic regression, SVM, CNN, and RNN based architectures were widely used. Manually engineered features were used to produce the best results in CLPsych 2021 with a weighted ensemble approach (Bayram and Benhiba, 2021) and a Bayesian model (Gamoran et al, 2021) while Matero et al (2019) used RNN-based architectures and Mohammadi et al (2019) used a fusion approach where RNN-based architectures were combined with CNN and SVM models to produce the best results at CLPsych 2019.…”
Section: Related Work (mentioning)
confidence: 99%
“…The goal of the tasks from the previous year was to assess the suicide risk of a user from posts 30 days or 6 months prior to a suicide attempt. The best-performing models used approaches such as weighted ensemble of different machine learning classifiers (LR, Naive Bayes classifiers, linear SVM) (Bayram and Benhiba, 2021), LSTM architecture with topic modelling and dictionary-based features (Gollapalli et al, 2021) and Bayesian modelling of features from Linguistic Inquiry and Word Count (LIWC) (Pennebaker et al, 2001), behavioural information or other features derived from already available or custom dictionaries (Gamoran et al, 2021).…”
Section: Related Work (mentioning)
confidence: 99%
“…Feature Extraction: For the first submission, we use two types of features. The first feature, n-grams, is selected due to their success in previous suicide risk detection research (Bayram and Benhiba, 2021; Pestian et al, 2020). Our n-gram features consist of unigrams and bigrams (n ∈ {1, 2}).…”
Section: Task B (mentioning)
confidence: 99%
“…In Task-A, we combine a seq2seq autoencoder and machine learning (ML) models to capture moments of change in a user's timeline. Meanwhile, in Task-B, we were partially influenced by the 2021 CLPsych results, which showed that merging long-term posts of a user could capture long-term suicidal ideation (Bayram and Benhiba, 2021; Macavaney et al, 2021). We used the post-level features extracted in Task-A to compute user-level emotion-bandwidth features and concatenated them with statistical n-gram features to detect suicidal risk levels.…”
Section: Introduction (mentioning)
confidence: 99%
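
Several of the citation statements above refer to n-gram features made of unigrams and bigrams (n ∈ {1, 2}). As a rough illustration of what such features look like, the sketch below counts them for one short text. It is an assumption-laden example, not any cited paper's pipeline: the function name `ngram_counts`, the lowercasing, and the whitespace tokenizer are all hypothetical choices, and the input sentence is a neutral placeholder.

```python
from collections import Counter


def ngram_counts(text, ns=(1, 2)):
    """Count word n-grams of each size in `ns` (default: unigrams, bigrams).

    Tokenization is a plain lowercase whitespace split, chosen only to keep
    the sketch self-contained; real systems usually use a proper tokenizer.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in ns:
        # Slide a window of width n over the token sequence.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Neutral placeholder text, not data from any shared task.
features = ngram_counts("the cat sat on the mat")
print(features["the"], features["the cat"])
```

In practice the per-text counts would be mapped onto a shared vocabulary to form fixed-length vectors before being passed to a classifier.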