Approaching Human Performance in Behavior Estimation in Couples Therapy Using Deep Sentence Embeddings

Tseng, Shao-Yen; Baucom, Brian R.; Georgiou, Panayiotis G.

doi:10.21437/interspeech.2017-1621

Cited by 12 publications

(11 citation statements)

References 20 publications


“…Finally, we applied a neural network on top of the embeddings to estimate actual behavior ratings. For this section we applied the framework proposed in [12]. Sessions were segmented into sentences and represented as a sequence of embeddings.…”

Section: Rating Estimation Using Neural Network (mentioning)

confidence: 99%


Unsupervised online multitask learning of behavioral sentence embeddings

Tseng

Baucom

Georgiou

2019

PeerJ Computer Science

Self Cite


Unsupervised learning has been an attractive method for easily deriving meaningful data representations from vast amounts of unlabeled data. These representations, or embeddings, often yield superior results in many tasks, whether used directly or as features in subsequent training stages. However, the quality of the embeddings is highly dependent on the assumed knowledge in the unlabeled data and how the system extracts information without supervision. Domain portability is also very limited in unsupervised learning, often requiring re-training on other in-domain corpora to achieve robustness. In this work we present a multitask paradigm for unsupervised contextual learning of behavioral interactions which addresses unsupervised domain adaptation. We introduce an online multitask objective into unsupervised learning and show that sentence embeddings generated through this process increase performance on affective tasks.


“…The final session label was obtained by training a Support Vector Regressor to map from the median of the window predictions to the session rating. For more details the reader can refer to [12].…”

Section: Rating Estimation Using Neural Network (mentioning)

confidence: 99%
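
The aggregation step quoted above (median of window predictions, then a learned map to the session rating) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data values and the function names are hypothetical, and a plain least-squares line stands in for the Support Vector Regressor so that the sketch needs nothing beyond the Python standard library.

```python
from statistics import median


def session_feature(window_predictions):
    """Collapse per-window predictions into one number: their median."""
    return median(window_predictions)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a*x + b.

    Assumption: this simple line is a stand-in for the Support Vector
    Regressor named in the quoted text, not the cited method itself.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


# Hypothetical training data: window-level predictions for three sessions
# and the corresponding human session ratings.
train_windows = [[0.1, 0.2, 0.9], [0.5, 0.4, 0.6], [0.8, 0.9, 0.7]]
train_ratings = [2.0, 5.0, 8.0]

features = [session_feature(w) for w in train_windows]
slope, intercept = fit_line(features, train_ratings)


def predict_session(window_predictions):
    """Map the median window prediction to a session-level rating."""
    return slope * session_feature(window_predictions) + intercept
```

Taking the median before regression makes the session feature less sensitive to a few outlying windows, which is presumably why a robust statistic is used here rather than the mean.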


“…More details about the recruitment, data collection and the annotations can be found in (Christensen et al, 2004; Baucom et al, 2011). Consistent with previous work (Lee et al, 2010; Georgiou et al, 2011; Black et al, 2013; Lee et al, 2014; Tseng et al, 2017), for each participant and behavior, we take the average of the annotators' ratings as the true rating in that session. Therefore, each speaker's data sample consists of the manual transcription of their utterances and their behavior ratings in that session.…”

Section: Description Of Corpus (mentioning)

confidence: 99%

“…Subsequently, there have been efforts (Narayanan and Georgiou, 2013) to automate this behavior annotation (or coding) process using machine learning so that rapid and inexpensive feedback can be provided to the stakeholders. Previous work has shown that automated coding systems are effective at quantifying behaviors from speech and spoken language such as Negativity (Georgiou et al, 2011; Black et al, 2013; Chakravarthula et al, 2015a; Tseng et al, 2017), Depression (Gupta et al, 2014; Morales et al, 2018) and Empathy (Xiao et al, 2012; Gibson et al, 2016; Pérez-Rosas et al, 2017). However, there are some critical aspects of this behavior assessment process which humans can handle naturally and easily but machines still cannot, one of which is the notion of how much to observe in order to reliably assess behavior.…”

Section: Introduction (mentioning)

confidence: 99%

“…Such approaches, as shown in Figure 1, typically first compute a "local" behavior score within a fixed-length observation window, such as a few words or speaker turns, and then aggregate local scores from all the windows to obtain a "global" score for behavior. These are especially useful in psychological research settings (Xiao et al, 2012; Gibson et al, 2016; Tseng et al, 2017) where automated systems need to analyze "sessions", or interactions, at fine-grained scales, such as turn-level, but are evaluated at session-level. In such situations, the choice of length of the observation window is important; too short a window can result in noisy or incorrect local scores, since insufficient information is being used, and as a result, the global summary score will be inaccurate, as illustrated in the toy example in Figure 2.…”

Section: Introduction (mentioning)

confidence: 99%
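
The local-to-global scheme described in the statement above can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the toy lexicon, the word-fraction scorer, the non-overlapping windows, and the choice of the mean as the aggregator are hypothetical and are not taken from the cited papers.

```python
from statistics import mean

# Hypothetical toy lexicon; a real system would use a trained model
# rather than word matching.
NEGATIVE_WORDS = {"never", "fault", "blame", "hate"}


def local_scores(words, window_len):
    """Score each non-overlapping window as its fraction of negative words."""
    scores = []
    for start in range(0, len(words), window_len):
        window = words[start:start + window_len]
        hits = sum(1 for w in window if w in NEGATIVE_WORDS)
        scores.append(hits / len(window))
    return scores


def global_score(words, window_len):
    """Aggregate the local window scores into one session-level score."""
    return mean(local_scores(words, window_len))


# Hypothetical one-sentence "session".
session = "you never listen and it is always my fault".split()

three_word_windows = local_scores(session, 3)
one_word_windows = local_scores(session, 1)
```

With one-word windows every local score in this toy example is either 0 or 1, which is one simple way to see how a very short observation window produces coarse, noisy local estimates.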


An analysis of observation length requirements for machine understanding of human behaviors from spoken language

Chakravarthula

Baucom

Narayanan

et al. 2019

Preprint

Self Cite


Machine learning-based human behavior modeling, often at the level of characterizing an entire clinical encounter such as a therapy session, has been shown to be useful across a range of domains in psychological research and practice from relationship and family studies to cancer care. Existing approaches typically first quantify the target behavior construct based on cues in an observation window, such as a fixed number of words, and then aggregate it over all the windows in that session. During this process, a sufficiently long window is employed so that adequate information is gathered to accurately estimate the construct. The link between behavior modeling and the observation length, however, has not been well studied, especially for spoken language. In this paper, we analyze the effect of observation window length on the quality of behavior quantification and present a framework for determining appropriate windows for a wide range of behaviors. Our analysis method employs two levels of evaluations: (a) extrinsic similarity between machine predictions and human expert annotations, and (b) intrinsic consistency between intra-machine and intra-human behavior relations. We apply our analysis on a dataset of real-life married couple interactions that are annotated for a large and diverse set of behavior codes and test the robustness of our findings to different machine learning models. We find that negative constructs such as blame can be accurately identified from short expressions while those pertaining to positive affect such as satisfaction tend to require slightly longer observation windows. Behaviors that describe more complex personality traits such as negotiation and avoidance are found to require very long observations and are difficult to quantify from language alone. 
Our findings are generally in agreement with similar work using acoustic vocal cues as well as existing literature in psychology on thin slices and human emotion perception.

Footnote 1: We use the term 'behavior' to refer not just to physical actions such as facial expressions, body gestures and speech but also to the underlying state of mind that is expressed through these actions.


Natural language processing for mental health interventions: a systematic review and research framework

Malgaroli

Hull

Zech

et al. 2023

Transl Psychiatry


Neuropsychiatric disorders pose a high societal cost, but their treatment is hindered by lack of objective outcomes and fidelity metrics. AI technologies and specifically Natural Language Processing (NLP) have emerged as tools to study mental health interventions (MHI) at the level of their constituent conversations. However, NLP’s potential to address clinical and research challenges remains unclear. We therefore conducted a pre-registered systematic review of NLP-MHI studies using PRISMA guidelines (osf.io/s52jh) to evaluate their models, clinical applications, and to identify biases and gaps. Candidate studies (n = 19,756), including peer-reviewed AI conference manuscripts, were collected up to January 2023 through PubMed, PsycINFO, Scopus, Google Scholar, and ArXiv. A total of 102 articles were included to investigate their computational characteristics (NLP algorithms, audio features, machine learning pipelines, outcome metrics), clinical characteristics (clinical ground truths, study samples, clinical focus), and limitations. Results indicate a rapid growth of NLP MHI studies since 2019, characterized by increased sample sizes and use of large language models. Digital health platforms were the largest providers of MHI data. Ground truth for supervised learning models was based on clinician ratings (n = 31), patient self-report (n = 29) and annotations by raters (n = 26). Text-based features contributed more to model accuracy than audio markers. Patients’ clinical presentation (n = 34), response to intervention (n = 11), intervention monitoring (n = 20), providers’ characteristics (n = 12), relational dynamics (n = 14), and data preparation (n = 4) were commonly investigated clinical categories. Limitations of reviewed studies included lack of linguistic diversity, limited reproducibility, and population bias. 
A research framework is developed and validated (NLPxMHI) to assist computational and clinical researchers in addressing the remaining gaps in applying NLP to MHI, with the goal of improving clinical utility, data access, and fairness.
