Background: Qualitative research methods are increasingly used across disciplines because they help investigators understand the perspectives of participants in their own words. However, qualitative analysis is a laborious and resource-intensive process: to achieve depth, researchers are limited to smaller sample sizes when analyzing text data. One potential method to address this concern is natural language processing (NLP). Qualitative text analysis involves researchers reading data, assigning code labels, and iteratively developing findings; NLP has the potential to automate part of this process. Unfortunately, little methodological research has compared automatic coding using NLP techniques with qualitative coding, a comparison that is critical to establishing the viability of NLP as a useful, rigorous analysis procedure.

Objective: The purpose of this study was to compare the utility of a traditional qualitative text analysis, an NLP analysis, and an augmented approach that combines qualitative and NLP methods.

Methods: We conducted a 2-arm crossover experiment to compare qualitative and NLP approaches to analyzing data generated through 2 text message (short message service) survey questions, one about prescription drugs and the other about police interactions, sent to youth aged 14-24 years. We randomly assigned a question to each of the 2 experienced qualitative analysis teams for independent coding and analysis before they received the NLP results. A third team separately conducted an NLP analysis of the same 2 questions. We examined the results of our analyses to compare (1) the similarity of the findings derived, (2) the quality of the inferences generated, and (3) the time spent in analysis.

Results: The qualitative-only analysis of the drug question (n=58) yielded 4 major findings, whereas the NLP analysis yielded 3 findings that missed contextual elements. The augmented qualitative and NLP analysis was the most comprehensive. For the police question (n=68), the qualitative-only analysis yielded 4 primary findings and the NLP-only analysis yielded 4 slightly different findings. Again, the augmented qualitative and NLP analysis was the most comprehensive and produced the highest quality inferences, increasing our depth of understanding (ie, details and frequencies). In terms of time, the NLP-only approach was quicker than the qualitative-only approach for both the drug (120 vs 270 minutes) and police (40 vs 270 minutes) questions. An approach beginning with qualitative analysis followed by qualitative- or NLP-augmented analysis took longer than one beginning with NLP for both the drug (450 vs 240 minutes) and police (390 vs 220 minutes) questions.

Conclusions: NLP provides both a foundation for coding qualitatively more quickly and a method for validating qualitative findings. NLP methods were able to identify the major themes found with traditional qualitative analysis but were not useful in identifying nuances. Traditional qualitative text analysis added important details and context.
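NLP-assisted coding of short free-text responses, as described above, often begins with a simple term-frequency pass to surface candidate themes before qualitative analysts add context. A minimal sketch of that first step; the sample responses, stopword list, and tokenizer below are illustrative assumptions, not the study's data or pipeline:

```python
from collections import Counter
import re

# Hypothetical free-text survey responses (illustrative only, not study data).
responses = [
    "I worry about prescription drug misuse at school",
    "Drugs like painkillers are easy to get from friends",
    "My school talks about drug safety but not prescriptions",
]

# A tiny stand-in stopword list; real pipelines use a curated one.
STOPWORDS = {"i", "about", "at", "are", "to", "get", "from", "my", "but", "not", "like"}

def top_terms(texts, k=5):
    """Tokenize, drop stopwords, and count terms across all responses."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(k)

print(top_terms(responses, k=3))
```

Terms such as "drug" and "school" surface as candidate themes, but, as the abstract notes, frequency counts alone miss the contextual nuance that qualitative coding supplies.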
Background: There has been little progress in adolescent health outcomes in recent decades. Researchers and youth-serving organizations struggle to accurately elicit youth voice and translate youth perspectives into health care policy.

Objective: Our aim is to describe the protocol of the MyVoice Project, a longitudinal mixed methods study designed to engage youth, particularly those not typically included in research. Text messaging surveys are collected, analyzed, and disseminated in real time to leverage youth perspectives to influence policy.

Methods: Youth aged 14 to 24 years are recruited to receive weekly text message surveys on a variety of policy and health topics. The research team, including academic researchers, methodologists, and youth, develops questions through an iterative writing and piloting process. Question topics are elicited from community organizations, researchers, and policy makers to inform salient policies. A youth-centered interactive platform has been developed that automatically sends confidential weekly surveys and incentives to participants. Parental consent is not required because the survey poses minimal risk to participants. Recruitment occurs online (eg, Facebook, Instagram, university health research website) and in person at community events. Weekly surveys collect both quantitative and qualitative data. Quantitative data are analyzed using descriptive statistics. Qualitative data are rapidly analyzed using natural language processing and traditional qualitative methods. Mixed methods integration and analysis support a more in-depth understanding of the research questions.

Results: We are currently recruiting and enrolling participants through in-person and online strategies. Question development, weekly data collection, data analysis, and dissemination are in progress.

Conclusions: MyVoice quickly ascertains the thoughts and opinions of youth in real time using a widespread, readily available technology: text messaging. Results are disseminated to researchers, policy makers, and youth-serving organizations through a variety of methods. Policy makers and organizations also share their priority areas with the research team to inform additional question sets that address important policy decisions. Youth-serving organizations can use the results to make decisions that promote youth well-being.
Existing fact-finding models assume the availability of structured data or accurate information extraction. However, as online data becomes increasingly unstructured, these assumptions no longer hold. To overcome this, we propose a novel, content-based trust propagation framework that relies on signals from the textual content to ascertain the veracity of free-text claims and compute the trustworthiness of their sources. We incorporate the quality of relevant content into the framework and present an iterative algorithm for propagating trust scores. We show that existing fact finders on structured data can be modeled as specific instances of this framework. Using a retrieval-based approach to find relevant articles, we instantiate the framework to compute the trustworthiness of news sources and articles. We show that the proposed framework ascertains the trustworthiness of sources more accurately, and that ranking news articles by trustworthiness learned from the content-driven framework significantly outperforms baselines that ignore either the content quality or the trust framework.
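The iterative propagation can be read as a fixed-point computation: a claim's confidence aggregates the trust of the sources asserting it, weighted by a content-quality signal, and a source's trust aggregates the confidence of the claims it asserts. A minimal sketch under those assumptions; the averaging update, the max-normalization, and the example outlets and qualities are illustrative choices, not the paper's exact formulation:

```python
def propagate_trust(claims, iterations=25):
    """Iteratively propagate trust between sources and free-text claims.

    claims: dict mapping claim id -> list of (source, content_quality) pairs,
    where content_quality in [0, 1] is a stand-in for the content-based
    signals scoring how well the source's text supports the claim.
    """
    sources = {s for asserters in claims.values() for s, _ in asserters}
    trust = {s: 0.5 for s in sources}  # start every source at neutral trust
    confidence = {}
    for _ in range(iterations):
        # Claim confidence: quality-weighted mean of asserting sources' trust.
        for claim, asserters in claims.items():
            confidence[claim] = sum(q * trust[s] for s, q in asserters) / len(asserters)
        # Source trust: mean confidence of the claims the source asserts.
        for s in sources:
            own = [confidence[c] for c, asserters in claims.items()
                   if any(src == s for src, _ in asserters)]
            trust[s] = sum(own) / len(own)
        # Normalize so the best source has trust 1; keeps scores from decaying.
        top = max(trust.values()) or 1.0
        trust = {s: t / top for s, t in trust.items()}
    return trust, confidence

# Hypothetical usage: a well-supported outlet vs one making a low-quality claim.
claims = {
    "claim_a": [("outlet1", 0.9), ("outlet2", 0.8)],
    "claim_b": [("outlet1", 0.9)],
    "claim_c": [("outlet3", 0.2)],
}
trust, confidence = propagate_trust(claims)
```

Here "outlet1", which asserts multiple high-quality claims, converges to the top trust score, while "outlet3", backed only by a low-quality claim, decays toward zero, mirroring the mutual reinforcement between source trust and claim confidence that the framework describes.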