to clarify some misstated data in the first paragraph.
Research Summary We assess the predictive validity and differential prediction by race of one pretrial risk assessment, the Public Safety Assessment (PSA). The PSA was developed with support from the Laura and John Arnold Foundation (LJAF) to reduce the burden placed on vulnerable populations at the front end of the criminal justice system. The growing and disparate use of incarceration is one of the most pressing social issues facing the United States. The implementation of risk assessments has provided fuel for both sides of the reform debate: proponents argue that these assessments offer a policy mechanism to reduce incarcerated populations and bias, whereas critics argue that they exacerbate bias and do not improve decision-making. By examining a statewide data set from Kentucky (N = 164,597), we found the PSA to have predictive validity measures in line with what is generally accepted within the criminal justice field. The differences we found indicate that the PSA scores for failure to appear (FTA) are moderated by race, but these differences do not lead to disparate impact.
Policy Implications We point to data limitations and the need for localized risk assessment studies, and we emphasize that risk assessments are decision-making tools that require ongoing refinement. Risk assessment developers, opponents, and proponents would do better to focus on the reality of risk assessments as probabilistic models. The results of these assessments cannot predict with certainty, and they are not inherently biased. Rather, criminologists and policy makers need to understand the uncertainty that comes with any predictive model.
Background Electronic nicotine delivery system (ENDS) brands, such as JUUL, used social media as a key component of their marketing strategy, which led to massive sales growth from 2015 to 2018. During this time, ENDS use rapidly increased among youths and young adults, with flavored products being particularly popular among these groups.
Objective The aim of our study is to develop a named entity recognition (NER) model to identify potential emerging vaping brands and flavors from Instagram post text. NER is a natural language processing task for identifying specific types of words (entities) in text based on the characteristics of the entity and surrounding words.
Methods NER models were trained on a labeled data set of 2272 Instagram posts coded for ENDS brands and flavors. We compared three types of NER models (conditional random fields, a residual convolutional neural network, and a fine-tuned distilled bidirectional encoder representations from transformers [FTDB] network) on their ability to identify brands and flavors in Instagram posts, with precision, recall, and F1 scores as the key model outcomes. We used data from Nielsen scanner sales and Wikipedia to create benchmark dictionaries to determine whether brands from established ENDS brand and flavor lists were mentioned in the Instagram posts in our sample. To guard against overfitting, we performed 5-fold cross-validation and reported the mean and SD of the model validation metrics across the folds.
Results For brands, the residual convolutional neural network exhibited the highest mean precision (0.797, SD 0.084), and the FTDB exhibited the highest mean recall (0.869, SD 0.103). For flavors, the FTDB exhibited both the highest mean precision (0.860, SD 0.055) and the highest mean recall (0.801, SD 0.091). All NER models outperformed the benchmark brand and flavor dictionary look-ups on mean precision, recall, and F1. Between the two benchmark brand lists, the larger Wikipedia list outperformed the Nielsen list in both precision and recall.
Conclusions Our findings suggest that NER models correctly identified ENDS brands and flavors in Instagram posts at rates competitive with, or better than, others in the published literature. Brands identified during manual annotation showed little overlap with those in Nielsen scanner data, suggesting that NER models may capture emerging brands with limited sales and distribution. NER models address the challenges of manual brand identification and can be used to support future infodemiology and infoveillance studies. Brands identified on social media should be cross-validated with Nielsen and other data sources to differentiate emerging brands that have become established from those with limited sales and distribution.
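As a rough illustration of the evaluation described in the Methods, the sketch below computes precision, recall, and F1 for a set of predicted entities, implements a simple dictionary look-up baseline, and summarizes a metric across folds as a mean and SD. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the whole-token matching rule, the use of the sample SD, and the brand names in the usage example are all hypothetical.

```python
from statistics import mean, stdev


def prf1(gold, predicted):
    """Precision, recall, and F1 for two collections of entity strings."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def dictionary_lookup(text, dictionary):
    """Baseline: dictionary terms that appear as whole lowercase tokens.

    Assumption: single-word terms and whitespace tokenization; the
    abstract does not say how its look-up matched terms.
    """
    tokens = set(text.lower().split())
    return {term for term in dictionary if term.lower() in tokens}


def summarize_folds(scores):
    """Mean and sample SD of one metric across cross-validation folds.

    Assumption: sample SD (n - 1 denominator); the abstract only says "SD".
    """
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    # Hypothetical post text and brand names, for illustration only.
    post = "trying cloudco and mistlab today"
    gold_brands = {"cloudco", "mistlab"}
    predicted = dictionary_lookup(post, {"cloudco", "vapex"})
    print(prf1(gold_brands, predicted))
    # Made-up per-fold precision values from a 5-fold run.
    print(summarize_folds([0.8, 0.9, 0.7, 0.85, 0.75]))
```

In this toy case the look-up misses a brand that is absent from the dictionary, which lowers recall; that is the kind of gap between established brand lists and brands seen in posts that the Conclusions describe.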