[Survey chart: Is there a reproducibility crisis? 52% Yes, a significant crisis; 38% Yes, a slight crisis; 3% No, there is no crisis; 7% Don't know. 1,576 researchers surveyed.]

More than 70% of researchers have tried and failed to reproduce another scientist's experiments, and more than half have failed to reproduce their own experiments. Those are some of the telling figures that emerged from Nature's survey of 1,576 researchers who took a brief online questionnaire on reproducibility in research. The data reveal sometimes-contradictory attitudes towards reproducibility. Although 52% of those surveyed agree that there is a significant 'crisis' of reproducibility, less than 31% think that failure to reproduce published results means that the result is probably wrong, and most say that they still trust the published literature. Data on how much of the scientific literature is reproducible are rare and generally bleak. The best-known analyses, from psychology [1] and cancer biology [2], found rates of around 40% and 10%, respectively. Our survey respondents were more optimistic: 73% said that they think that at least half of the papers in their field can be trusted, with physicists and chemists generally showing the most confidence. The results capture a confusing snapshot of attitudes around these issues, says Arturo Casadevall, a microbiologist at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. "At the current time there is no consensus on what reproducibility is or should be." But just recognizing that is a step forward, he says. "The next step may be identifying what is the problem and to get a consensus."
This article was originally submitted for publication to the Editor of Advances in Methods and Practices in Psychological Science (AMPPS) in 2015. When the submitted manuscript was subsequently posted online (Silberzahn et al., 2015), it received some media attention, and two of the authors were invited to write a brief commentary in Nature advocating for greater crowdsourcing of data analysis by scientists. This commentary, arguing that crowdsourced research "can balance discussions, validate findings and better inform policy" (Silberzahn & Uhlmann, 2015, p. 189), included a new figure that displayed the analytic teams' effect-size estimates and cited the submitted manuscript as the source of the findings, with a link to the preprint. However, the authors neglected to add a citation of the Nature commentary to the final published version of the AMPPS article, or to note that the main findings had been previously publicized via the commentary, the online preprint, research presentations at conferences and universities, and media reports by other people. The authors regret the oversight.
Purpose: The present paper assessed the validity of single-item life satisfaction measures by comparing single-item measures to the Satisfaction with Life Scale (SWLS), a more psychometrically established measure.

Methods: Two large samples from Washington (N=13,064) and Oregon (N=2,277) recruited by the Behavioral Risk Factor Surveillance System (BRFSS) and a representative German sample (N=1,312) recruited by the German Socio-Economic Panel (GSOEP) were included in the present analyses. Single-item life satisfaction measures and the SWLS were correlated with theoretically relevant variables, such as demographics, subjective health, domain satisfaction, and affect. The correlations between the two life satisfaction measures and these variables were examined to assess the construct validity of single-item life satisfaction measures.

Results: Consistent across the three samples, single-item life satisfaction measures demonstrated a substantial degree of criterion validity with the SWLS (zero-order r = 0.62-0.64; disattenuated r = 0.78-0.80). Patterns of statistical significance for correlations with theoretically relevant variables were the same across single-item measures and the SWLS. Single-item measures did not produce systematically different correlations compared with the SWLS (average difference = 0.001-0.005). The average absolute difference in the magnitudes of the correlations produced by single-item measures and the SWLS was very small (average absolute difference = 0.015-0.042).

Conclusions: Single-item life satisfaction measures performed very similarly to the multiple-item SWLS. Social scientists would get virtually identical answers to substantive questions regardless of which measure they use.

Keywords: life satisfaction; single-item measure; Satisfaction with Life Scale; validity; measurement

Subjective well-being is an overarching construct that captures the affective feelings and cognitive judgments people have about the quality of their lives.
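The jump from the zero-order correlations (r = 0.62-0.64) to the disattenuated ones (r = 0.78-0.80) comes from Spearman's correction for attenuation, which divides an observed correlation by the geometric mean of the two measures' reliabilities. A minimal sketch, assuming for illustration reliabilities of .80 for each measure (the abstract does not report them):

```python
import math

def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation: the correlation the two
    measures would show if both were measured without error."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Illustrative reliabilities of .80 for both measures (assumed, not from the paper)
for r in (0.62, 0.64):
    print(round(disattenuate(r, 0.80, 0.80), 3))
```

With these assumed reliabilities the corrected values land near the reported 0.78-0.80 range; the actual reliability estimates in the paper would shift them slightly.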
Life satisfaction is a component of subjective well-being that reflects the cognitive evaluation of whether one is happy with one's life. Understanding life satisfaction is important as it is associated with positive life outcomes, such as health [1], and the Behavioral Risk Factor Surveillance System (BRFSS) [10]. These studies measure many variables from thousands or even millions of respondents, so single-item measures are often used because participant burden is of primary concern. Given the increasing use of single-item life satisfaction measures in both research and policy settings, there is a pressing need to understand the psychometric properties of these measures. The goal of the current paper is to assess the psychometric properties of single-item life satisfaction measures with three separate samples totaling over 16,000 participants.

When evaluating the psychometric properties of a measure, researchers are typically interested in two features: reliability and validity. With regard to reliability, conventional measures that rely on internal...
Previous research has shown that having rich neighbors is associated with reduced levels of subjective well-being, an effect that is likely due to social comparison. The current study examined the role of income inequality as a moderator of this relative income effect. Multilevel analyses were conducted on a sample of over 1.7 million people from 2,425 counties in the United States. Results showed that higher income inequality was associated with stronger relative income effects. In other words, people were more strongly influenced by the income of their neighbors when income inequality was high.
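The moderation described above is a cross-level interaction in a multilevel model: an individual-level predictor (neighbors' relative income) interacting with a county-level one (income inequality). A minimal sketch on simulated data, with hypothetical variable names and a built-in interaction; this is an illustration of the model class, not the authors' actual specification:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_counties, n_per = 60, 50
county = np.repeat(np.arange(n_counties), n_per)
inequality = np.repeat(rng.uniform(0, 1, n_counties), n_per)   # county-level Gini-like index
rel_income = rng.normal(0, 1, n_counties * n_per)              # neighbors' income relative to own
# Simulate the moderation: relative income hurts well-being more when inequality is high
swb = (5 - 0.2 * rel_income - 0.5 * rel_income * inequality
       + np.repeat(rng.normal(0, 0.3, n_counties), n_per)      # county random intercepts
       + rng.normal(0, 1, n_counties * n_per))                 # individual noise

df = pd.DataFrame(dict(swb=swb, rel_income=rel_income,
                       inequality=inequality, county=county))
# Random-intercept model with a cross-level interaction
fit = smf.mixedlm("swb ~ rel_income * inequality", df, groups=df["county"]).fit()
print(fit.params["rel_income:inequality"])  # negative: stronger relative-income effect under high inequality
```

A negative interaction coefficient is what "higher income inequality was associated with stronger relative income effects" looks like in this parameterization.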
Author contributions: The 1st through 4th and last authors developed the research questions, oversaw the project, and contributed equally. The 1st through 3rd authors oversaw the Main Studies and Replication Studies, and the 4th, 6th, 7th, and 8th authors oversaw the Forecasting Study. The 1st, 4th, 5th, 8th, and 9th authors conducted the primary analyses. The 10th through 15th authors conducted the Bayesian analyses. The 1st and 16th authors conducted the multivariate meta-analysis.
This crowdsourced project introduces a collaborative approach to improving the reproducibility of scientific research, in which findings are replicated in qualified independent laboratories before (rather than after) they are published. Our goal is to establish a non-adversarial replication process with highly informative final results. To illustrate the Pre-Publication Independent Replication (PPIR) approach, 25 research groups conducted replications of all ten moral judgment effects that the last author and his collaborators had "in the pipeline" as of August 2014. Six findings replicated according to all replication criteria, one finding replicated but with a significantly smaller effect size than the original, one finding replicated consistently in the original culture but not outside of it, and two findings failed to replicate. In total, 40% of the original findings failed at least one major replication criterion. Potential ways to implement and incentivize pre-publication independent replication on a large scale are discussed.
Schnall, Benton, and Harvey (2008) hypothesized that physical cleanliness reduces the severity of moral judgments. In support of this idea, they found that individuals make less severe judgments when they are primed with the concept of cleanliness (Exp. 1) and when they wash their hands after experiencing disgust (Exp. 2). We conducted direct replications of both studies using materials supplied by the original authors. We did not find evidence that physical cleanliness reduced the severity of moral judgments, using sample sizes that provided over .99 power to detect the original effect sizes. Our estimates of the overall effect size were much smaller than estimates from Experiment 1 (original d = −0.60, 95% CI [−1.23, 0.04], N = 40; replication d = −0.01, 95% CI [−0.28, 0.26], N = 208) and Experiment 2 (original d = −0.85, 95% CI [−1.47, −0.22], N = 43; replication d = 0.01, 95% CI [−0.34, 0.36], N = 126). These findings suggest that the population effect sizes are probably substantially smaller than the original estimates. Researchers investigating the connections between cleanliness and morality should therefore use large sample sizes to have the necessary power to detect subtle effects.
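The claim of over .99 power can be checked with a normal-approximation power calculation for a two-sided, two-sample t test. A sketch, assuming equal group sizes (the abstract reports only total Ns) and the conventional alpha of .05:

```python
import math

def power_two_sample(d: float, n1: int, n2: int) -> float:
    """Approximate power of a two-sided two-sample t test at alpha = .05,
    using the normal approximation to the noncentral t distribution."""
    ncp = abs(d) * math.sqrt(n1 * n2 / (n1 + n2))  # noncentrality parameter
    z_crit = 1.959964                              # z for two-sided alpha = .05
    # Power ~= Phi(ncp - z_crit); standard normal CDF via erf
    return 0.5 * (1 + math.erf((ncp - z_crit) / math.sqrt(2)))

# Replication Exp. 1: N = 208 (assumed 104 per group), original |d| = 0.60
print(power_two_sample(0.60, 104, 104))
# Replication Exp. 2: N = 126 (assumed 63 per group), original |d| = 0.85
print(power_two_sample(0.85, 63, 63))
```

Both values come out above .99, consistent with the power claim in the abstract.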