Replication studies in psychological science sometimes fail to reproduce prior findings. If these studies use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the protocol rather than a challenge to the original finding. Formal pre-data-collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replication studies from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) for which the original authors had expressed concerns about the replication designs before data collection; only one of these studies had yielded a statistically significant effect (p < .05). Commenters suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these RP:P studies failed to replicate the original effects. We revised the replication protocols and received formal peer review prior to conducting new replication studies. We administered the RP:P and revised protocols in multiple laboratories (median number of laboratories per original study = 6.5, range = 3–9; median total sample = 1,279.5, range = 276–3,512) for high-powered tests of each original finding with both protocols. Overall, following the preregistered analysis plan, we found that the revised protocols produced effect sizes similar to those of the RP:P protocols (Δr = .002 or .014, depending on analytic approach). The median effect size for the revised protocols (r = .05) was similar to that of the RP:P protocols (r = .04) and the original RP:P replications (r = .11), and smaller than that of the original studies (r = .37). Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes (median r = .07, range = .00–.15) were 78% smaller, on average, than the original effect sizes (median r = .37, range = .19–.50).
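To make the "78% smaller, on average" comparison concrete, the sketch below shows one way such a percent-reduction figure can be derived from per-study correlations: compute each study's shrinkage relative to its own original effect, then average. This is an illustrative reconstruction, not the authors' analysis code, and the r values in it are hypothetical placeholders rather than the study data.

```python
import numpy as np

# Hypothetical per-study effect sizes (correlations); NOT the study's data.
original_r = np.array([0.50, 0.37, 0.19, 0.42, 0.33])
replication_r = np.array([0.15, 0.07, 0.00, 0.10, 0.05])

# One way to get an "X% smaller on average" figure: per-study shrinkage
# relative to each original effect, then averaged across studies.
shrinkage = 1 - replication_r / original_r

print(f"median replication r: {np.median(replication_r):.2f}")
print(f"mean percent reduction: {100 * shrinkage.mean():.0f}%")
```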
Although early economic approaches to misbehavior merely compare the monetary utility of the available options, self-concept maintenance models introduce moral considerations into the equation: they assume that people trade off the possible gains from moral transgressions against the associated decreases in self-esteem. On the assumption that moral values are less developed in children than in adults, we expected children's behavior to be close to that of the hypothetical homo economicus, with their decisions about whether to cheat therefore influenced by decision-theoretic factors: normative ones such as temptation magnitude, and behavioral ones such as framing. Because children should pay no heed to moral considerations in dynamic multiple-task settings, we expected behavior opposite to "moral cleansing," with a first lie predicting later lies. In addition to testing these ideas, the present study adopted a novel methodology. Our hypotheses were tested in a lab-in-the-field study using a modified "roll a die" method in a naturalistic setting with children aged 7 to 10. We modified the method to identify both the true and the declared values of die rolls (using a novel DICE+ electronic die). As expected, children were sensitive to temptation and cheated more willingly for more attractive prizes. Girls (but not boys) lied more, in both frequency and magnitude, to avoid losses (loss framing) than to make gains (gain framing). Lying on earlier tasks correlated positively with lying on subsequent tasks.
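Because a device like the DICE+ die records both the true and the declared roll, lying frequency and magnitude can be computed directly for each child rather than inferred from aggregate roll distributions. The simulation below sketches that logic under stated assumptions; the lying model, its parameters, and the function name are hypothetical illustrations, not the study's procedure.

```python
import random

def simulate_child(n_rolls: int = 10, p_lie: float = 0.3) -> dict:
    """Simulate declared vs. true rolls; assumed model: lies over-report.
    p_lie is a hypothetical per-roll lying probability, not an estimate."""
    lies, magnitudes = 0, []
    for _ in range(n_rolls):
        true_roll = random.randint(1, 6)
        declared = true_roll
        if true_roll < 6 and random.random() < p_lie:
            declared = random.randint(true_roll + 1, 6)  # inflate the report
        if declared != true_roll:
            lies += 1
            magnitudes.append(declared - true_roll)
    return {
        "lie_frequency": lies / n_rolls,
        "mean_lie_magnitude": sum(magnitudes) / len(magnitudes) if magnitudes else 0.0,
    }

print(simulate_child())
```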
Replication efforts in psychological science sometimes fail to reproduce prior findings. If replications use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the replication protocol rather than a challenge to the original finding. Formal pre-data-collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replications from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) for which the original authors had expressed concerns about the replication designs before data collection; only one of these replications was statistically significant (p < .05). Commenters on RP:P suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these replications failed (Gilbert et al., 2016). We revised the replication protocols and received formal peer review prior to conducting new replications. We administered the RP:P and Revised replication protocols in multiple laboratories (median number of laboratories per original study = XX, range = XX to YY; median total sample = XX, range = XX to YY) for high-powered tests of each original finding with both protocols. Overall, XX of 10 RP:P protocols and XX of 10 Revised protocols showed significant evidence in the same direction as the original finding (p < .05), compared with an expected XX. The median effect size was [larger/smaller/similar] for Revised protocols (ES = .XX) compared to RP:P protocols (ES = .XX), and [larger/smaller/similar] compared to the original studies (ES = .XX) and [larger/smaller/similar] compared to the original RP:P replications (ES = .XX). Overall, Revised protocols produced [much larger/somewhat larger/similar] effect sizes compared to RP:P protocols (ES = .XX). We also elicited peer beliefs about the replications through prediction markets and surveys of a group of researchers in psychology. The peer researchers predicted that the Revised protocols would [decrease/not affect/increase] the replication rate, [consistent with/not consistent with] the observed replication results. The results suggest that the lack of replicability of these findings observed in RP:P was [partly/completely/not] due to discrepancies in the RP:P protocols that could be resolved with expert peer review.
In a test of their global-/local-processing-style model, Förster, Liberman, and Kuschel (2008) found that people assimilate a primed concept (e.g., "aggressive") into their social judgments after a global prime (e.g., they rate a person as being more aggressive than do people in a no-prime condition) but contrast their judgments away from the primed concept after a local prime (e.g., they rate the person as being less aggressive than do people in a no-prime condition). This effect was not replicated by Reinhard (2015) in the Reproducibility Project: Psychology. However, the authors of the original study noted that the replication could not provide a test of the moderation effect because priming did not occur. They suggested that the primes might have been insufficiently applicable and the scenarios insufficiently ambiguous to produce priming. In the current replication project, we used both Reinhard's protocol and a revised protocol designed to increase the likelihood of priming, to test the original authors' suggested explanation for why Reinhard did not observe the moderation effect. Teams from nine universities contributed to this project. We first conducted a pilot study (N = 530) and successfully selected ambiguous scenarios for each site. We then pilot-tested the aggression prime at five different sites (N = 363) and found that it did not successfully produce priming. In agreement with the first author of the original report, we replaced the prime with a task that had successfully primed aggression (hostility) in a pilot study by McCarthy et al. (2018). In the final replication study (N = 1,460), we did not find moderation by protocol type, and judgment patterns in both protocols were inconsistent with the effects observed in the original study. We discuss these findings and possible explanations.
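The central test described above is a moderation (interaction) analysis: whether the effect of the prime condition on judgments differs between the two protocols. The sketch below shows a minimal version of such a test on simulated null data; the column names, model specification, and data are assumptions for illustration, not the project's actual analysis script.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400

# Simulated null data: ratings do not actually depend on prime or protocol.
df = pd.DataFrame({
    "protocol": rng.choice(["RPP", "Revised"], n),
    "prime": rng.choice(["global", "local", "none"], n),
    "rating": 4 + rng.normal(0, 1, n),
})

# The prime-by-protocol interaction terms carry the moderation test.
model = smf.ols("rating ~ C(prime) * C(protocol)", data=df).fit()
print(model.summary().tables[1])
```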