2017
DOI: 10.3758/s13428-017-0885-7

The conditional power of randomization tests for single-case effect sizes in designs with randomized treatment order: A Monte Carlo simulation study

Abstract: The conditional power (CP) of the randomization test (RT) was investigated in a simulation study in which three different single-case effect size (ES) measures were used as the test statistic: the mean difference (MD), the Percentage of Nonoverlapping Data (PND), and the Nonoverlap of All Pairs (NAP). Furthermore, we studied the effect of the experimental design on the RT's CP for three different single-case designs with rapid treatment alternation: the completely randomized design (CRD), the randomized block …
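As a rough illustration of the procedure the abstract describes, the sketch below runs an exhaustive one-sided randomization test for a completely randomized design, using either MD or NAP as the test statistic. It is not the authors' simulation code: the function names, the example scores, the one-sided direction (B expected to exceed A), and the rule of counting ties as 0.5 in NAP are all assumptions made for this example, and PND is left out for brevity.

```python
from itertools import combinations


def mean_difference(a, b):
    """MD: mean of condition B minus mean of condition A."""
    return sum(b) / len(b) - sum(a) / len(a)


def nap(a, b):
    """NAP: share of (A, B) pairs in which B exceeds A (ties count 0.5)."""
    wins = sum(1.0 if y > x else 0.5 if y == x else 0.0 for x in a for y in b)
    return wins / (len(a) * len(b))


def randomization_test(scores, labels, statistic):
    """Exhaustive one-sided randomization test for a CRD (hypothetical helper).

    Every way of assigning the observed number of B labels to the
    measurement occasions is treated as equally likely under the null.
    Returns the observed statistic and the proportion of assignments
    whose statistic is at least as large as the observed one.
    """
    n = len(scores)
    observed_b = [i for i, lab in enumerate(labels) if lab == "B"]

    def stat_for(b_idx):
        chosen = set(b_idx)
        a = [scores[i] for i in range(n) if i not in chosen]
        b = [scores[i] for i in b_idx]
        return statistic(a, b)

    observed = stat_for(observed_b)
    reference = [stat_for(c) for c in combinations(range(n), len(observed_b))]
    # Small tolerance guards against floating-point noise in the comparison.
    p_value = sum(s >= observed - 1e-12 for s in reference) / len(reference)
    return observed, p_value


# Made-up data: six measurement occasions, three per condition.
scores = [3, 8, 9, 4, 7, 2]
labels = ["A", "B", "B", "A", "B", "A"]

md_obs, md_p = randomization_test(scores, labels, mean_difference)
nap_obs, nap_p = randomization_test(scores, labels, nap)
print(md_obs, md_p)
print(nap_obs, nap_p)
```

With six occasions split three and three there are only 20 possible assignments, so exhaustive enumeration is cheap. Longer series would normally call for a random sample of assignments instead.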

Cited by 10 publications (25 citation statements)
References 77 publications
“…Therefore, the correspondence between the current findings and previous evidence is high and the conclusions about the performance of the randomization test can be generalized beyond systematic ATDs and beyond the mean difference as a test statistic. Michiels et al (2017) found that the conditional power for ATD-RR is higher than for ATD-RB and our findings are consistent for ALIV as a test statistic. Moreover, despite some notable differences between the two studies (i.e., test statistic, conditional vs. unconditional power studied, two-sided vs. one-sided alternative hypothesis), both suggest that a medium-sized effect (according to the benchmarks proposed by Harrington & Velicer, 2015) such as 2 standard deviations can be detected as statistically significant with sufficient power for ATD-RR with as few as 12 measurement occasions, whereas small effects such as 1 standard deviation require more than 30 measurement occasions.…”
Section: Discussion (supporting)
confidence: 87%
“…The present study provides initial simulation evidence on the performance of ALIV+RT (Manolov & Onghena, 2017), and it provides further evidence on the performance of VSC (Lanovaz, Cardinal et al, 2017) for ATD-RR and ATD-RB. In addition, the study also extends the evidence available on the performance of randomization tests with ATDs: (a) Levin et al (2012) studied systematically alternating designs (e.g., ABABABABABAB) with 12 and 24 measurement occasions, (b) Michiels et al (2018) studied the conditional power (Keller, 2012) for ATD-RR and ATD-RB with 12 to 40 measurement occasions, and both (a) and (b) used the mean difference (not ALIV) as a test statistic. In what follows, we compare the current results with these previous findings.…”
Section: Discussion (mentioning)
confidence: 64%
“…Using randomization tests 20 randomized baselines are necessary to create the option of a minimal p value of 0.05 per participant. Regarding statistical power, with a minimum number of measures of 14 per phase, a minimum of 62 measures in total for the participant with the shortest baseline, and 20 participants, previous simulation studies indicate that this study can reach sufficient statistical power (Heyvaert et al, 2017; Michiels et al, 2018). The 20 baselines varying from 14 to 33 days were randomized and then allocated to participants who met inclusion criteria.…”
Section: Methods (mentioning)
confidence: 91%
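The link between 20 randomized baselines and a minimal p value of 0.05 in the statement above follows from a general property of randomization tests, sketched here under the assumption that all k admissible assignments are equally likely (the symbols k and p_min are introduced for this note only):

```latex
% Smallest attainable p value when the observed assignment is the single
% most extreme of k equally likely assignments:
\[
  p_{\min} = \frac{1}{k},
  \qquad
  k = 20 \;\Rightarrow\; p_{\min} = \frac{1}{20} = 0.05 .
\]
```

With fewer than 20 possible assignments, no outcome could reach significance at the 0.05 level.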
“…The RT is also flexible with regard to the choice of the test statistic (Ferron & Sentovich, 2002; Onghena, 1992; Onghena & Edgington, 2005). For example, it is possible to use an ES measure based on standardized mean differences as the test statistic in the RT, but also ES measures based on data nonoverlap (Heyvaert & Onghena, 2014; Michiels, Heyvaert, & Onghena, 2018). This freedom to devise a test statistic that fits the research question makes the RT a versatile statistical tool for various research settings and treatment effects (e.g., with mean level differences, trends, or changes in variability; Dugard, 2014).…”
Section: Data Analysis Of Randomized AB Phase Designs: Techniques And (mentioning)
confidence: 99%