[Survey of 1,576 researchers, "Is there a reproducibility crisis?": 52% yes, a significant crisis; 38% yes, a slight crisis; 3% no, there is no crisis; 7% don't know.]

More than 70% of researchers have tried and failed to reproduce another scientist's experiments, and more than half have failed to reproduce their own experiments. Those are some of the telling figures that emerged from Nature's survey of 1,576 researchers who took a brief online questionnaire on reproducibility in research.

The data reveal sometimes-contradictory attitudes towards reproducibility. Although 52% of those surveyed agree that there is a significant 'crisis' of reproducibility, less than 31% think that failure to reproduce published results means that the result is probably wrong, and most say that they still trust the published literature.

Data on how much of the scientific literature is reproducible are rare and generally bleak. The best-known analyses, from psychology [1] and cancer biology [2], found rates of around 40% and 10%, respectively. Our survey respondents were more optimistic: 73% said that they think that at least half of the papers in their field can be trusted, with physicists and chemists generally showing the most confidence.

The results capture a confusing snapshot of attitudes around these issues, says Arturo Casadevall, a microbiologist at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. "At the current time there is no consensus on what reproducibility is or should be." But just recognizing that is a step forward, he says. "The next step may be identifying what is the problem and to get a consensus."
Access to data is a critical feature of an efficient, progressive and ultimately self-correcting scientific ecosystem. But the extent to which the in-principle benefits of data sharing are realized in practice is unclear. Crucially, it is largely unknown whether published findings can be reproduced by repeating reported analyses upon shared data ('analytic reproducibility'). To investigate this, we conducted an observational evaluation of a mandatory open data policy introduced at the journal Cognition. Interrupted time-series analyses indicated a substantial post-policy increase in data availability statements (104/417, 25% pre-policy to 136/174, 78% post-policy), although not all data appeared reusable (23/104, 22% pre-policy to 85/136, 62% post-policy). For 35 of the articles determined to have reusable data, we attempted to reproduce 1,324 target values. Ultimately, 64 values could not be reproduced within a 10% margin of error. For 22 articles all target values were reproduced, but 11 of these required author assistance. For 13 articles at least one value could not be reproduced despite author assistance. Importantly, there were no clear indications that original conclusions were seriously impacted. Mandatory open data policies can increase the frequency and quality of data sharing. However, suboptimal data curation, unclear analysis specification and reporting errors can impede analytic reproducibility, undermining the utility of data sharing and the credibility of scientific findings.
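As an illustration of the kind of check implied by a 10% margin of error, the sketch below compares hypothetical reproduced values against originally reported ones; the vectors and the exact percentage-error definition are assumptions for illustration, not the authors' actual pipeline.

```r
# Minimal sketch (not the authors' pipeline): flag reproduced values whose
# percentage error relative to the originally reported value exceeds 10%.
original   <- c(0.45, 12.30, 1.98, 0.07)   # hypothetical reported target values
reproduced <- c(0.44, 12.31, 2.40, 0.07)   # hypothetical values from re-analysis

pct_error <- abs(reproduced - original) / abs(original) * 100
data.frame(original, reproduced, pct_error, within_margin = pct_error <= 10)
```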
Android robots are entering human social life. However, human-robot interactions may be complicated by a hypothetical Uncanny Valley (UV) in which imperfect human-likeness provokes dislike. Previous investigations using unnaturally blended images reported inconsistent UV effects. We demonstrate a UV in subjects' explicit ratings of likability for a large, objectively chosen sample of 80 real-world robot faces and a complementary controlled set of edited faces. An "investment game" showed that the UV penetrated even more deeply to influence subjects' implicit decisions concerning robots' social trustworthiness, and that these fundamental social decisions depend on subtle cues of facial expression that are also used to judge humans. Preliminary evidence suggests that category confusion may occur in the UV but does not mediate the likability effect. These findings suggest that while classic elements of human social psychology govern human-robot social interaction, robust UV effects pose a formidable android-specific problem.
In this paper, we propose a new template for empirical studies intended to assess causal effects: the outcome-wide longitudinal design. The approach is an extension of what is often done to assess the causal effects of a treatment or exposure using confounding control, but now over numerous outcomes. We discuss the temporal and confounding control principles for such outcome-wide studies, metrics to evaluate robustness or sensitivity to potential unmeasured confounding for each outcome, and approaches to handle multiple testing. We argue that the outcome-wide longitudinal design has numerous advantages over more traditional studies of single exposure-outcome relationships, including results that are less subject to investigator bias, greater potential to report null effects, greater capacity to compare effect sizes, a tremendous gain in efficiency for the research community, greater policy relevance and a more rapid advancement of knowledge. We discuss both the practical and theoretical justification for the outcome-wide longitudinal design and also the pragmatic details of its implementation, providing publicly available R code.
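A minimal sketch of the general structure of such an analysis is shown below, assuming a hypothetical data frame with one exposure, a few baseline confounders and several simulated outcomes; it fits one confounder-adjusted regression per outcome and applies a Bonferroni correction for multiple testing, and is an illustration in the spirit of the design rather than the authors' published R code.

```r
# Outcome-wide sketch: one confounder-adjusted model per outcome, then a
# multiplicity correction across outcomes. Data are simulated for illustration.
set.seed(1)
d <- data.frame(exposure = rbinom(500, 1, 0.4),
                age = rnorm(500, 50, 10),
                sex = rbinom(500, 1, 0.5))
outcomes <- paste0("y", 1:6)
for (y in outcomes) d[[y]] <- rnorm(500) + 0.1 * d$exposure  # simulated outcomes

fit_one <- function(y) {
  f <- reformulate(c("exposure", "age", "sex"), response = y)
  coef(summary(lm(f, data = d)))["exposure", c("Estimate", "Pr(>|t|)")]
}
res <- t(sapply(outcomes, fit_one))
data.frame(res, p_bonferroni = p.adjust(res[, "Pr(>|t|)"], method = "bonferroni"))
```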
Importance: Psychological stress contributes to numerous diseases and may do so in part through damage to telomeres, protective non-coding segments on the ends of chromosomes.
Objective: We conducted a systematic review and meta-analysis to determine the association between self-reported, perceived psychological stress (PS) and telomere length (TL).
Data Sources: We searched 3 databases (PubMed, PsycInfo, and Scopus), completed manual searches of published and unpublished studies, and contacted all study authors to obtain potentially relevant data.
Study Selection: Two independent reviewers assessed studies for original research measuring (but not necessarily reporting the correlation between) PS and TL in human subjects. 23 studies met inclusion criteria; 22 (totaling 8,948 subjects) could be meta-analyzed.
Data Extraction and Synthesis: We assessed study quality using modified MINORS criteria. Since not all included studies reported PS-TL correlations, we obtained them via direct calculation from author-provided data (7 studies), contact with authors (14 studies), or extraction from the published article (1 study).
Main Outcomes and Measures: We conducted random-effects meta-analysis on our primary outcome, the age-adjusted PS-TL correlation. We investigated potential confounders and moderators (sex, life stress exposure, and PS measure validation) via post hoc subset analyses and meta-regression.
Results: Increased PS was associated with a very small decrease in TL (n = 8,724 total; r = −0.06; 95% CI: −0.10, −0.008; p = 0.01; α = 0.025), adjusting for age. This relationship was similar between sexes and within studies using validated measures of PS, and marginally (nonsignificantly) stronger among samples recruited for stress exposure (r = −0.13; vs. general samples: b = −0.11; 95% CI: −0.27, 0.01; p = 0.05; α = 0.013). Publication bias may exist; correcting for its effects attenuated the relationship.
Conclusions and Relevance: Our analysis finds a very small, statistically significant relationship between increased PS (as measured over the past month) and decreased TL that may reflect publication bias. The association may be stronger with known major stressors and is similar in magnitude to that noted between obesity and TL. All included studies used single measures of short-term stress; the literature suggests long-term chronic stress may have a larger cumulative effect. Future research should assess for potential confounders and use longitudinal, multidimensional models of stress.
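For readers unfamiliar with how such correlations are pooled, the sketch below runs a basic random-effects meta-analysis of correlations on the Fisher-z scale (DerSimonian-Laird) with made-up study values; it is not the review's data or code, and it omits the age adjustment and moderator analyses described above.

```r
# Random-effects meta-analysis of correlations on the Fisher-z scale
# (DerSimonian-Laird), using hypothetical study correlations and sample sizes.
r <- c(-0.10, -0.02, -0.08, 0.01, -0.12)   # hypothetical study correlations
n <- c(320, 150, 510, 95, 700)             # hypothetical sample sizes

z <- atanh(r)                 # Fisher z transform
v <- 1 / (n - 3)              # approximate sampling variance of z
w <- 1 / v                    # fixed-effect weights
Q <- sum(w * (z - sum(w * z) / sum(w))^2)
tau2 <- max(0, (Q - (length(z) - 1)) / (sum(w) - sum(w^2) / sum(w)))
wr <- 1 / (v + tau2)          # random-effects weights
z_pooled  <- sum(wr * z) / sum(wr)
se_pooled <- sqrt(1 / sum(wr))
tanh(z_pooled + c(-1.96, 0, 1.96) * se_pooled)  # pooled r with 95% CI
```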
Replicability is an important feature of scientific research, but aspects of contemporary research culture, such as an emphasis on novelty, can make replicability seem less important than it should be. The Reproducibility Project: Cancer Biology was set up to provide evidence about the replicability of preclinical research in cancer biology by repeating selected experiments from high-impact papers. A total of 50 experiments from 23 papers were repeated, generating data about the replicability of a total of 158 effects. Most of the original effects were positive effects (136), with the rest being null effects (22). A majority of the original effect sizes were reported as numerical values (117), with the rest being reported as representative images (41). We employed seven methods to assess replicability, and some of these methods were not suitable for all the effects in our sample. One method compared effect sizes: for positive effects, the median effect size in the replications was 85% smaller than the median effect size in the original experiments, and 92% of replication effect sizes were smaller than the original. The other methods were binary – the replication was either a success or a failure – and five of these methods could be used to assess both positive and null effects when effect sizes were reported as numerical values. For positive effects, 40% of replications (39/97) succeeded according to three or more of these five methods, and for null effects 80% of replications (12/15) were successful on this basis; combining positive and null effects, the success rate was 46% (51/112). A successful replication does not definitively confirm an original finding or its theoretical interpretation. Equally, a failure to replicate does not disconfirm a finding, but it does suggest that additional investigation is needed to establish its reliability.
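To make the binary criteria concrete, the sketch below evaluates two simplified replication criteria of the kind described (significance in the original direction, and whether the original estimate lies inside the replication's 95% confidence interval) on hypothetical estimates; these are illustrative and not necessarily identical to the paper's seven methods.

```r
# Two simplified binary replication criteria, applied to hypothetical effect
# estimates and standard errors reported on the same scale.
orig_est <- 0.80; orig_se <- 0.25   # hypothetical original result
rep_est  <- 0.15; rep_se  <- 0.20   # hypothetical replication result

rep_ci <- rep_est + c(-1.96, 1.96) * rep_se
# Criterion 1: replication statistically significant in the original direction
crit_same_direction_sig <- (sign(rep_est) == sign(orig_est)) &&
  abs(rep_est / rep_se) > 1.96
# Criterion 2: original point estimate falls inside the replication's 95% CI
crit_orig_in_rep_ci <- orig_est >= rep_ci[1] && orig_est <= rep_ci[2]
c(same_direction_sig = crit_same_direction_sig, orig_in_rep_ci = crit_orig_in_rep_ci)
```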
We propose sensitivity analyses for publication bias in meta-analyses. We consider a publication process such that 'statistically significant' results are more likely to be published than negative or 'non-significant' results by an unknown ratio, η. Our proposed methods also accommodate some plausible forms of selection based on a study's standard error. Using inverse probability weighting and robust estimation that accommodates non-normal population effects, small meta-analyses, and clustering, we develop sensitivity analyses that enable statements such as 'For publication bias to shift the observed point estimate to the null, "significant" results would need to be at least 30-fold more likely to be published than negative or "non-significant" results'. Comparable statements can be made regarding shifting to a chosen non-null value or shifting the confidence interval. To aid interpretation, we describe empirical benchmarks for plausible values of η across disciplines. We show that a worst-case meta-analytic point estimate for maximal publication bias under the selection model can be obtained simply by conducting a standard meta-analysis of only the negative and 'non-significant' studies; this method sometimes indicates that no amount of such publication bias could 'explain away' the results. We illustrate the proposed methods using real meta-analyses and provide an R package, PublicationBias.
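The worst-case idea can be illustrated in a few lines: pool only the negative and 'non-significant' studies. The estimates below are hypothetical, and a simple fixed-effect pooling stands in for the paper's robust estimator, which the authors' PublicationBias package implements in full.

```r
# Worst-case sketch: meta-analyse only the negative and 'non-significant'
# studies, using hypothetical estimates and standard errors and a simple
# fixed-effect pooling in place of the paper's robust estimator.
est <- c(0.55, 0.48, 0.10, 0.05, -0.02, 0.12)   # hypothetical study estimates
se  <- c(0.20, 0.18, 0.15, 0.16, 0.14, 0.17)    # hypothetical standard errors

z <- est / se
nonsig_or_negative <- est < 0 | abs(z) < 1.96    # exclude 'affirmative' studies
w <- 1 / se[nonsig_or_negative]^2
worst_case <- sum(w * est[nonsig_or_negative]) / sum(w)
worst_case                                       # worst-case pooled estimate
```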