The text-package: An R-package for analyzing and visualizing human language using natural language processing and transformers.

Kjell, Oscar; Giorgi, Salvatore; Schwartz, H. Andrew

doi:10.1037/met0000542

Cited by 11 publications

(4 citation statements)

References 89 publications

“…We first sought to provide a visual depiction of memory content as a function of depression symptoms and boredom proneness. To do so, we constructed a supervised dimension projection plot 81 for participants’ text descriptions of their recurrent IAMs. In brief, this method uses word embeddings to identify words that are significantly related to high versus low scorers on variables of interest (i.e., depression symptoms and boredom proneness).…”

Section: Methods (mentioning)

confidence: 99%

Disentangling boredom from depression using the phenomenology and content of involuntary autobiographical memories

Yeung, Danckert, van Tilburg et al. 2024

Sci Rep

Recurrent involuntary autobiographical memories (IAMs) are memories retrieved unintentionally and repetitively. We examined whether the phenomenology and content of recurrent IAMs could differentiate boredom and depression, both of which are characterized by affective dysregulation and spontaneous thought. Participants (n = 2484) described their most frequent IAM and rated its phenomenological properties (e.g., valence). Structural topic modeling, a method of unsupervised machine learning, identified coherent content within the described memories. Boredom proneness was positively correlated with depressive symptoms, and both boredom proneness and depressive symptoms were correlated with more negative recurrent IAMs. Boredom proneness predicted less vivid recurrent IAMs, whereas depressive symptoms predicted more vivid, negative, and emotionally intense ones. Memory content also diverged: topics such as relationship conflicts were positively predicted by depressive symptoms, but negatively predicted by boredom proneness. Phenomenology and content in recurrent IAMs can effectively disambiguate boredom proneness from depressive symptoms in a large sample of undergraduate students from a racially diverse university.

“…We will also briefly mention how to adapt the code for regression problems, that is, the prediction of continuous variables. Similar to the analyses reported by Kjell, Giorgi, and Schwartz (2023b) … In our example, the data are available as .csv files and use a semicolon for separating the texts and labels. Since multiple other types of files could also be used to store the data, the format argument of the load_dataset function in the dataset package allows for specifying many commonly found file formats, such as "txt", "csv"…”

Section: Loading the Data (mentioning)

confidence: 99%

“…Conceptually, this model is comparable to a logistic regression model but aims at providing a model where the sum of the squared regression coefficients is small, while still providing a high goodness-of-fit to the data (via regularization). A similar approach was used, for instance, by Kjell et al. (2023b). This model choice was motivated by the heuristic assumption that some entries of the most word embeddings might not be useful for predicting the harmony in life score. Naturally, we could also choose a more complex machine learning model to learn the statistical relationship between the word embeddings and the target variable, such as random forests.…”

Section: Using Embeddings As Features (mentioning)

confidence: 99%

From Embeddings to Explainability: A Tutorial on Transformer-Based Text Analysis for Social and Behavioral Scientists

Debelak, Koch, Aßenmacher et al. 2024

Preprint

Large language models and their use for text analysis have had a significant impact on psychology and the social and behavioral sciences in general. Key applications include the analysis of texts, such as social media posts, to infer psychological traits, as well as survey and interview analysis. In this tutorial paper, we demonstrate the use of the Python-based natural language processing software package transformers (and related modules from the huggingface universe) that allow for the automated classification of text inputs. In doing so, we rely on pre-trained transformer models which can be fine-tuned to a specific task and domain. The first proposed application of this model class is to use it as a feature extractor, allowing for the transformation of written text into real-valued numerical vectors (called "embeddings") that capture a text's semantic meaning. These vectors can, in turn, be used as input for a subsequent machine-learning model. The second presented application of transformer models is the end-to-end training (so-called "fine-tuning") of the model. This results in a direct prediction of the label within the same model that directly maps the text to the embeddings. While in the second case, results are usually better and training works more seamlessly, the model itself is often not directly interpretable. We showcase an alleviation of this issue via the application of post-hoc interpretability methods by calculating SHAP values and applying local interpretable model-agnostic explanations (LIME) in an attempt to explain the model's inner workings.
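
The "embeddings as features" idea described in the citation statement and abstract above can be illustrated with a minimal sketch. Everything here is an assumption for illustration and is not taken from the cited tutorial: the random 4-dimensional vectors merely stand in for transformer embeddings, the target is a made-up continuous score, and the helper names fit_ridge, predict, and sq_norm are hypothetical. The sketch fits a ridge (L2-penalized) regression by plain gradient descent and shows that a stronger penalty yields a smaller sum of squared coefficients.

```python
import random


def fit_ridge(X, y, alpha, lr=0.05, epochs=2000):
    """Fit ridge regression by batch gradient descent.

    Minimizes mean squared error + (alpha / n) * sum of squared weights.
    The intercept b is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            grad_b += 2.0 * err / n
            for j in range(d):
                grad_w[j] += 2.0 * err * xi[j] / n
        for j in range(d):
            # Data gradient plus the gradient of the L2 penalty term.
            w[j] -= lr * (grad_w[j] + 2.0 * alpha * w[j] / n)
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Return predicted scores for each feature vector in X."""
    return [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X]


def sq_norm(w):
    """Sum of squared coefficients (the quantity ridge keeps small)."""
    return sum(wj * wj for wj in w)


# Toy data (assumption): random vectors standing in for text embeddings.
# Only the first two dimensions carry signal about the target score.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(60)]
y = [2.0 * x[0] - 1.0 * x[1] + random.gauss(0, 0.1) for x in X]

w_weak, b_weak = fit_ridge(X, y, alpha=0.01)    # almost no regularization
w_strong, b_strong = fit_ridge(X, y, alpha=50.0)  # strong regularization
preds = predict(X, w_weak, b_weak)

print("weak penalty, sum of squared coefficients:", round(sq_norm(w_weak), 3))
print("strong penalty, sum of squared coefficients:", round(sq_norm(w_strong), 3))
```

In practice the feature vectors would come from a pre-trained transformer and the penalty strength would typically be chosen by cross-validation; neither step is shown in this sketch.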

“…One potential way to alleviate these limitations is to construct a method of agency measurement in language by exploiting deep learning, a technique that has recently gained significant traction in solving text classification problems. Notably, solutions based on language representation models (LRMs), such as transformers architecture, achieve results comparable with human performance, which is considered a gold standard in many text classification and evaluation tasks (e.g., Wang et al 2018; Wang et al 2022; Kjell et al 2023).…”

Section: Measuring Agency In Language (mentioning)

confidence: 99%

BERTAgent: The Development of a Novel Tool to Quantify Agency in Textual Data

Nikadon, Suitner, Erseghe et al. 2023

Preprint

Agency, pertaining to goal-orientation and achievement, is a fundamental aspect of human cognition and behavior. Accordingly, detecting and quantifying linguistic representations of agency is critical for the analysis of human actions, interactions, and social dynamics. Available agency-quantifying computational tools rely on word-counting methods, which are insensitive to the semantic context in which the words are used and consequently are inaccurate in case of polysemy and negation. Additionally, some currently available tools fail to account for differences in the intensity and directionality (valence) of agency. In order to overcome these shortcomings, we present BERTAgent, a novel tool to quantify semantic agency in text. BERTAgent is a computational language model that utilizes the transformers architecture, a popular deep learning approach to natural language processing. BERTAgent was fine-tuned using carefully selected textual data that were evaluated by human coders with respect to the level of conveyed agency. In five validation studies, we demonstrate that BERTAgent outperforms previous solutions in terms of convergent and discriminant validity. Additionally, the detailed description of BERTAgent’s development procedure serves as a tutorial for the advancement of similar tools, providing a blueprint for leveraging the existing lexicographical datasets in conjunction with the deep learning techniques in order to detect and quantify other psychological constructs in textual data. Links: https://pypi.org/project/bertagent/ , https://bertagent.readthedocs.io/ , https://github.com/cogsys-io/BERTAgent-SOM/ , https://github.com/cogsys-io/bertagent/
