2023
DOI: 10.1037/met0000542

The text-package: An R-package for analyzing and visualizing human language using natural language processing and transformers.

Abstract: The language that individuals use for expressing themselves contains rich psychological information. Recent significant advances in Natural Language Processing (NLP) and Deep Learning (DL), namely transformers, have resulted in large performance gains in tasks related to understanding natural language. However, these state-of-the-art methods have not yet been made easily accessible for psychology researchers, nor designed to be optimal for human-level analyses. This tutorial introduces text (https://r-text.org…

citations
Cited by 11 publications
(4 citation statements)
references
References 89 publications
“…We first sought to provide a visual depiction of memory content as a function depression symptoms and boredom proneness. To do so, we constructed a supervised dimension projection plot 81 for participants’ text descriptions of their recurrent IAMs. In brief, this method uses word embeddings to identify words that are significantly related to high versus low scorers on variables of interest (i.e., depression symptoms and boredom proneness).…”
Section: Methods (mentioning)
confidence: 99%
“…We will also briefly mention how to adapt the code for regression problems, that is, the prediction of continuous variables. Similar to the analyses reported by Kjell, Giorgi, and Schwartz (2023b) In our example, the data are available as .csv files and use a semicolon for separating the texts and labels. Since multiple other types of files could also be used to store the data, the format argument of the load_dataset function in the dataset package allows for specifying many commonly found file formats, such as "txt", "csv"…”
Section: Loading the Data (mentioning)
confidence: 99%
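
The data layout described in the statement above (texts and labels separated by a semicolon) can be illustrated with a minimal sketch. It uses only Python's standard `csv` module rather than the `load_dataset` function the citing paper mentions, and the column names `text` and `label`, the sample rows, and the helper `read_semicolon_csv` are all hypothetical.

```python
import csv
import io

# Hypothetical sample in the described layout: one header row, then
# one text and one label per line, separated by a semicolon.
SAMPLE = (
    "text;label\n"
    "I feel calm and content;1\n"
    "Everything feels out of balance;0\n"
)


def read_semicolon_csv(handle):
    """Return a list of row dicts read from a semicolon-separated file object."""
    reader = csv.DictReader(handle, delimiter=";")
    return [dict(row) for row in reader]


# io.StringIO stands in for an opened .csv file (an assumption for the demo).
rows = read_semicolon_csv(io.StringIO(SAMPLE))
texts = [row["text"] for row in rows]
labels = [int(row["label"]) for row in rows]
print(texts)
print(labels)
```

With a real file, the same helper would be called on `open(path, newline="", encoding="utf-8")` in place of the in-memory string.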
“…Conceptually, this model is comparable to a logistic regression model but aims at providing a model where the sum of the squared regression coefficients is small, while still providing a high goodness-of-fit to the data (via regularization). A similar approach was used, for instance, by Kjell et al (2023b). This model choice was motivated by the heuristic assumption that some entries of the most word embeddings might not be useful for predicting the harmony in life score Naturally, we could also choose a more complex machine learning model to learn the statistical relationship between the word embeddings and the target variable, such as random forests.…”
Section: Using Embeddings As Features (mentioning)
confidence: 99%
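
The idea in the statement above, a logistic regression whose squared coefficients are kept small through regularization, can be sketched in a few lines. This is not the citing authors' code or the text package's implementation: it is a plain gradient-descent illustration, and the two-dimensional toy "embeddings", the labels, the function names, and the hyperparameter values are all assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, l2=0.1, lr=0.5, steps=2000):
    """Minimize mean log-loss + l2 * sum(w_j ** 2) by gradient descent.

    The intercept b is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # The penalty l2 * w_j**2 contributes 2 * l2 * w_j to each gradient.
        w = [wj - lr * (gj + 2.0 * l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of class 1 for each row of X."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Hypothetical toy "embeddings" (two dimensions) with binary labels.
X = [[1.0, 0.2], [0.9, -0.1], [-1.0, 0.1], [-0.8, -0.3]]
y = [1, 1, 0, 0]

w_weak, b_weak = fit_ridge_logistic(X, y, l2=0.01)
w_strong, b_strong = fit_ridge_logistic(X, y, l2=1.0)
probs = predict_proba(X, w_strong, b_strong)
print(sum(v * v for v in w_weak), sum(v * v for v in w_strong))
```

Raising `l2` shrinks the sum of squared coefficients, which is the trade-off between coefficient size and goodness of fit that the statement describes.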
“…One potential way to alleviate these limitations is to construct a method of agency measurement in language by exploiting deep learning, a technique that has recently gained significant traction in solving text classification problems. Notably, solutions based on language representation models (LRMs), such as transformers architecture, achieve results comparable with human performance, which is considered a gold standard in many text classification and evaluation tasks (e.g., Wang et al 2018;Wang et al 2022;Kjell et al 2023).…”
Section: Measuring Agency In Language (mentioning)
confidence: 99%