Previous investigations into detecting mental illnesses through social media have predominantly focused on detecting depression through Twitter corpora (De Choudhury et al., 2013; Resnik et al., 2015; Pedersen, 2015). In this paper, we study anxiety disorders through personal narratives collected from the popular social media website Reddit. We build a substantial data set of typical and anxiety-related posts, and we apply N-gram language modeling, vector embeddings, topic analysis, and emotional norms to generate features that accurately classify posts by binary levels of anxiety. We achieve an accuracy of 91% with vector-space word embeddings, and an accuracy of 98% when these are combined with lexicon-based features.
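As a rough illustration of this kind of feature pipeline (not the authors' exact setup), the sketch below combines averaged word embeddings with a simple lexicon count and trains a binary classifier; the embedding table, lexicon, and data variables are hypothetical placeholders.

# Sketch of an embedding-plus-lexicon classifier for anxiety-related posts (illustrative only).
# Hypothetical inputs: posts (list of strings), labels (0 = typical, 1 = anxiety-related),
# word_vectors (dict mapping word -> numpy vector), anxiety_lexicon (set of words).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def featurize(post, word_vectors, anxiety_lexicon, dim=300):
    tokens = post.lower().split()
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    avg_vec = np.mean(vecs, axis=0) if vecs else np.zeros(dim)   # vector-space features
    lex_count = sum(t in anxiety_lexicon for t in tokens)        # lexicon-based feature
    return np.concatenate([avg_vec, [lex_count / max(len(tokens), 1)]])

def train_classifier(posts, labels, word_vectors, anxiety_lexicon):
    X = np.stack([featurize(p, word_vectors, anxiety_lexicon) for p in posts])
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)   # held-out accuracy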
Data from the National Center for Health Statistics (NCHS) show that the black population has the highest proportion of overweight among all adult populations in the United States. The present study compared the body mass index (BMI) and body fat percent from dual-photon absorptiometry in 1,324 healthy adults aged 18 to 107 years recruited from four ethnic groups in the New York City area: 523 whites, 280 blacks, 267 Asians, and 254 Puerto Ricans. Puerto Ricans had the largest BMI, the largest percent of subjects with body weight more than 120% of their ideal weight, and the largest fat percent of the four ethnic groups: 76% of Puerto Rican males had fat percent above the median value for white males (fat percent = 19.6%), and 95% of Puerto Rican females had fat percent above the median for white females (fat percent = 30.8%). Asians had the smallest BMI, but 63% of them had fat percent above the median values for whites in each gender. Puerto Ricans also had the largest waist-to-hip ratios among the four ethnic groups. In blacks, the percent of subjects with fat percent larger than the median for whites was slightly smaller than that for Puerto Ricans: 64% and 82% of males and females, respectively. These results differ from the latest NCHS data and show that Puerto Ricans in this sample are heavier and fatter than blacks.
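For reference, the anthropometric indices compared across groups are computed from standard definitions (these are the conventional formulas, not details of this study's protocol).

# Standard definitions of the indices compared in the study (illustrative only).
def bmi(weight_kg: float, height_m: float) -> float:
    # Body mass index: weight in kilograms divided by height in meters squared.
    return weight_kg / height_m ** 2

def waist_to_hip_ratio(waist_cm: float, hip_cm: float) -> float:
    # Waist circumference divided by hip circumference (same units for both).
    return waist_cm / hip_cm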
Open-domain dialog generation is a challenging problem: maximum likelihood training can lead to repetitive outputs, models have difficulty tracking long-term conversational goals, and training on standard movie or online datasets may lead to the generation of inappropriate, biased, or offensive text. Reinforcement Learning (RL) is a powerful framework that could potentially address these issues, for example by allowing a dialog model to optimize for reduced toxicity and repetitiveness. However, previous approaches that apply RL to open-domain dialog generation do so at the word level, making it difficult for the model to learn proper credit assignment for long-term conversational rewards. In this paper, we propose VHRL, a novel approach to hierarchical reinforcement learning (HRL) that uses policy gradients to tune the utterance-level embedding of a variational sequence model. This hierarchical approach provides greater flexibility for learning long-term conversational rewards. We use self-play and RL to optimize for a set of human-centered conversation metrics, and show that our approach provides significant improvements, in terms of both human evaluation and automatic metrics, over state-of-the-art dialog models, including Transformers.
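A minimal sketch of the utterance-level idea, under assumptions not spelled out in the abstract: the utterance-level embedding of the variational model is treated as the action of a Gaussian policy, and a REINFORCE-style gradient on a conversation-level reward updates the layer that produces that embedding. Module names, dimensions, and the reward signal here are hypothetical simplifications, not the paper's exact architecture.

# Toy utterance-level policy-gradient update (hypothetical simplification of the VHRL idea).
import torch

class UtterancePolicy(torch.nn.Module):
    # Maps a dialog context vector to a Gaussian over the utterance-level embedding.
    def __init__(self, context_dim=128, latent_dim=64):
        super().__init__()
        self.mean = torch.nn.Linear(context_dim, latent_dim)
        self.log_std = torch.nn.Parameter(torch.zeros(latent_dim))

    def forward(self, context):
        dist = torch.distributions.Normal(self.mean(context), self.log_std.exp())
        z = dist.sample()                        # utterance-level embedding acts as the "action"
        return z, dist.log_prob(z).sum(-1)

def utterance_level_update(policy, optimizer, context, conversation_reward):
    # conversation_reward: scalar(s) from human-centered metrics (e.g., sentiment, low repetition).
    z, log_prob = policy(context)
    loss = -(conversation_reward * log_prob).mean()   # REINFORCE objective at the utterance level
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return z  # would be decoded into words by the word-level decoder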
How can we train a dialog model to produce better conversations by learning from human feedback, without the risk of humans teaching it harmful chat behaviors? We start by hosting models online and gathering human feedback from real-time, open-ended conversations, which we then use to train and improve the models using offline reinforcement learning (RL). We identify implicit conversational cues, including language similarity, elicitation of laughter, sentiment, and more, which indicate positive human feedback, and embed these in multiple reward functions. A well-known challenge is that learning an RL policy in an offline setting usually fails due to the inability to explore and the tendency to make over-optimistic estimates of future reward. These problems become even harder when applying RL to language models, which can easily have a vocabulary of 20,000 actions and many possible reward functions. We address this challenge by developing a novel class of offline RL algorithms. These algorithms use KL-control to penalize divergence from a pretrained prior language model, and use a new strategy to make the algorithm pessimistic, instead of optimistic, in the face of uncertainty. We test the resulting dialog model with ratings from 80 users in an open-domain setting and find that it achieves significant improvements over existing deep offline RL approaches. The novel offline RL method is viable for improving any existing generative dialog model using a static dataset of human feedback.
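The abstract's two ingredients, KL-control against a pretrained prior and pessimism under uncertainty, can be sketched roughly as follows. The specific penalty form, the ensemble-minimum style of pessimism, and all tensor names are assumptions for illustration rather than the paper's exact algorithm.

# Rough sketch of a KL-controlled, pessimistic Q-learning target for offline dialog RL (illustrative only).
import torch
import torch.nn.functional as F

def kl_control_pessimistic_target(q_ensemble_next, prior_logits_next, policy_logits_next,
                                  reward, done, alpha=0.1, gamma=0.99):
    # q_ensemble_next: list of [batch, vocab] Q-value tensors from an ensemble of target networks.
    # prior_logits_next / policy_logits_next: [batch, vocab] next-token logits of the pretrained
    # prior language model and the learned policy; reward, done: [batch] tensors.
    pi = F.softmax(policy_logits_next, dim=-1)
    log_pi = F.log_softmax(policy_logits_next, dim=-1)
    log_prior = F.log_softmax(prior_logits_next, dim=-1)

    q_next = torch.min(torch.stack(q_ensemble_next), dim=0).values  # pessimism: ensemble minimum
    kl = (pi * (log_pi - log_prior)).sum(-1)                        # KL(policy || prior) per state
    v_next = (pi * q_next).sum(-1) - alpha * kl                     # expected value minus KL penalty
    return reward + gamma * (1.0 - done) * v_next                   # bootstrapped target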