Obtaining grant funding from the National Institutes of Health (NIH) is increasingly competitive, as funding success rates have declined over the past decade. To allocate relatively scarce funds, scientific peer reviewers must differentiate the very best applications from comparatively weaker ones. Despite the importance of this determination, little research has explored how reviewers assign ratings to the applications they review and whether there is consistency in the reviewers' evaluation of the same application. Replicating all aspects of the NIH peer-review process, we examined 43 individual reviewers' ratings and written critiques of the same group of 25 NIH grant applications. Results showed no agreement among reviewers regarding the quality of the applications in either their qualitative or quantitative evaluations. Although all reviewers received the same instructions on how to rate applications and format their written critiques, we also found no agreement in how reviewers "translated" a given number of strengths and weaknesses into a numeric rating. It appeared that the outcome of the grant review depended more on the reviewer to whom the grant was assigned than the research proposed in the grant. This research replicates the NIH peer-review process to examine in detail the qualitative and quantitative judgments of different reviewers examining the same application, and our results have broad relevance for scientific grant peer review.
Background: Prior text analysis of R01 critiques suggested that female applicants may be disadvantaged in NIH peer review, particularly for R01 renewals. NIH altered its review format in 2009. The authors examined R01 critiques and scoring in the new format for differences due to principal investigator (PI) sex.

Method: The authors analyzed 739 critiques: 268 from 88 unfunded and 471 from 153 funded applications for grants awarded to 125 PIs (76 male [61%]; 49 female [39%]) at the University of Wisconsin-Madison between 2010 and 2014. They used seven word categories for text analysis: ability, achievement, agentic, negative evaluation, positive evaluation, research, and standout adjectives. They used regression models to compare priority and criteria scores, as well as text-analysis results, for differences due to PI sex and whether the application was for a new (Type 1) or renewal (Type 2) R01.

Results: Approach scores predicted priority scores for all PIs' applications (P < .001), but scores and critiques differed significantly for male and female PIs' Type 2 applications. Reviewers assigned significantly worse priority, approach, and significance scores to female than to male PIs' Type 2 applications, despite using standout adjectives (e.g., "outstanding," "excellent") and making references to ability in more of their critiques (P < .05 for all comparisons).

Conclusions: The authors' analyses suggest that subtle gender bias may continue to operate in the post-2009 NIH review format in ways that could lead reviewers to implicitly hold male and female applicants to different standards of evaluation, particularly for R01 renewals.