25 Tweets to Know You: A New Model to Predict Personality with Social Media

Arnoux, Pierre-Hadrien; Xu, Anbang; Boyette, Neil; Mahmud, Jalal; Akkiraju, Rama; Sinha, Vibha Singhal

doi:10.1609/icwsm.v11i1.14963

Cited by 60 publications

(19 citation statements)

References 10 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…, 2011). Further, Arnoux et al. (2017) studied the accuracy of prior work on big five personality and the dependence on the size of the input text, and introduced a method using word embedding and Gaussian process.…”

Section: Theoretical Backgroundmentioning

confidence: 99%

Performance prediction of basketball players using automated personality mining with twitter data

Siemon

Wessels

2022

SBM

View full text Add to dashboard Cite

PurposeThe purpose of this paper is to use Twitter data to mine personality traits of basketball players to predict their performance in the National Basketball Association (NBA).Design/methodology/approachAutomated personality mining and robotic process automation were used to gather data (player statistics and big five personality traits) of n = 185 professional basketball players. Correlation analysis and multiple linear regressions were computed to predict the performance of their NBA careers based on previous college performance and personality traits.FindingsAutomated personality mining of Tweets can be used to gather additional information about basketball players. Extraversion, agreeableness and conscientiousness correlate with basketball performance and can be used, in combination with previous game statistics, to predict future performance.Originality/valueThe study presents a novel approach to use automated personality mining of Twitter data as a predictor for future basketball performance. The contribution advances the understanding of the importance of personality for sports performance and the use of cognitive systems (automated personality mining) and the social media data for predictions. Scouts can use our findings to enhance their recruiting criteria in a multi-million dollar business, such as the NBA.

show abstract

Section: Theoretical Backgroundmentioning

confidence: 99%

Performance prediction of basketball players using automated personality mining with twitter data

Siemon

Wessels

2022

SBM

View full text Add to dashboard Cite

show abstract

“…GloVe uses a count-based model, which learns embeddings by looking at how often a word appears in the context of another word within the corpus, focusing on the co-occurrence probabilities of words within a large training corpus of documents such as Wikipedia. Studies of personality inferences that use neural word embeddings include (Kamijo et al, 2016;Arnoux et al, 2017;Majumder et al, 2017;Jayaratne and Jayatilleke, 2020). Though pre-trained neural word embeddings are widely used, they assume that a word's meaning is relatively stable and does not change across different sentences.…”

Section: Word and Document Representationsmentioning

confidence: 99%

Explainable Personality Prediction Using Answers to Open-Ended Interview Questions

2022

View full text Add to dashboard Cite

In this work, we demonstrate how textual content from answers to interview questions related to past behavior and situational judgement can be used to infer personality traits. We analyzed responses from over 58,000 job applicants who completed an online text-based interview that also included a personality questionnaire based on the HEXACO personality model to self-rate their personality. The inference model training utilizes a fine-tuned version of InterviewBERT, a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model extended with a large interview answer corpus of over 3 million answers (over 330 million words). InterviewBERT is able to better contextualize interview responses based on the interview specific knowledge learnt from the answer corpus in addition to the general language knowledge already encoded in the initial pre-trained BERT. Further, the “Attention-based” learning approaches in InterviewBERT enable the development of explainable personality inference models that can address concerns of model explainability, a frequently raised issue when using machine learning models. We obtained an average correlation of r = 0.37 (p < 0.001) across the six HEXACO dimensions between the self-rated and the language-inferred trait scores with the highest correlation of r = 0.45 for Openness and the lowest of r = 0.28 for Agreeableness. We also show that the mean differences in inferred trait scores between male and female groups are similar to that reported by others using standard self-rated item inventories. Our results show the potential of using InterviewBERT to infer personality in an explainable manner using only the textual content of interview responses, making personality assessments more accessible and removing the subjective biases involved in human interviewer judgement of candidate personality.

show abstract

“…Based on the test results, we concluded that the CNN model trained using GloVe weighting could produce higher test data accuracy than the CNN model trained using random weighting or weighting without GloVe. Because GloVe word embedding had previously trained with large corpus from Wikipedia and Gigaword 5 (a collection of Englishlanguage news source networks), its vector representation brings external knowledge to our classification task [16]. The MBTI model only needs a slight update of the embedding weight value to reach the convergence point.…”

Section: Random and Glove Weighting Testmentioning

confidence: 99%

Myers-Briggs Type Indicator Personality Model Classification in English Text using Convolutional Neural Network Method

Sugihdharma

Bachtiar

2022

Jurnal Ilmu Komputer dan Informasi

View full text Add to dashboard Cite

Myers-Briggs Type Indicator (MBTI) is a personality model developed by Katharine Cooks Briggs and Isabel Briggs Myers in 1940. It displays a combination of preferences from four domains. Generally, test takers need to answer about 50 to 70 questions, and it is relatively expensive to know MBTI personality. The researcher developed a personality classification system using the Convolutional Neural Network (CNN) method and GloVe (Global Vectors for Word Representation) word embedding to solve this problem. The dataset used in this research consists of 8,675 data from the Kaggle site. The steps in this research are downloading the dataset from Kaggle, text preprocessing, GloVe weighting, classification using the CNN method, and evaluation using accuracy from the Confusion Matrix. Based on the tests carried out, using GloVe weighting can improve the model accuracy rather than random weighting. The best GloVe word dimensions depend on the metrics used to measure the model performance and the data of the classes contained in the dataset. From the CNN hyperparameter tuning test, the Adamax optimizer performs better and produces higher accuracy than the Adam optimizer. In addition, the CNN hyperparameter tuning increased model accuracy more significantly compared with the best GloVe word embedding dimensions.

show abstract

25 Tweets to Know You: A New Model to Predict Personality with Social Media

Cited by 60 publications

References 10 publications

Performance prediction of basketball players using automated personality mining with twitter data

Performance prediction of basketball players using automated personality mining with twitter data

Explainable Personality Prediction Using Answers to Open-Ended Interview Questions

Myers-Briggs Type Indicator Personality Model Classification in English Text using Convolutional Neural Network Method

Contact Info

Product

Resources

About