Measurement practices in large-scale replications: Insights from Many Labs 2.

2020. DOI: 10.1037/cap0000220

Abstract: Validity of measurement is integral to the interpretability of research endeavours and any subsequent replication attempts. To assess current measurement practices and the construct validity of measures in large-scale replication studies, we conducted a systematic review of measures used in Many Labs 2: Investigating Variation in Replicability Across Samples and Settings (Klein et al., 2018). To evaluate the psychometric properties of the scales used in Many Labs 2, we conducted factor and reliability analyses …
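The abstract truncates before the analytic details. As a minimal sketch of the kind of reliability analysis it describes (not the authors' code; the data and item counts here are made up), Cronbach's alpha for an item-score matrix can be computed as:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the total (sum) score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-item Likert scale (1-7) answered by 200 respondents.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))  # a shared latent trait drives all items
scores = np.clip(np.round(4 + latent + rng.normal(scale=0.8, size=(200, 5))), 1, 7)
print(f"alpha = {cronbach_alpha(scores):.2f}")
```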

Cited by 40 publications (28 citation statements). References 55 publications.

Citation statements (ordered by relevance):
“…This has been documented in literature on emotions (69% of scales sampled; Weidman et al., 2017), education and behavior (40–90% of articles sampled; Barry et al., 2014), and social psychology and personality (40% of scales sampled; Flake et al., 2017). Most recently, Shaw et al. (2020) reviewed all of the measures used in the Many Labs 2 studies; 34 of the 43 (79%) item-based scales appeared to be ad hoc, having been created by the study authors and used without supporting validity information. If a study uses undeveloped measures with no validity evidence, many questions remain, and the validity of the study conclusions is cast in serious doubt.…”
Section: Using Questions That Promote Validity of Measure Use (mentioning)
confidence: 99%
“…When scientists lack validity evidence for measures, they lack the necessary information to evaluate the overall validity of a study’s conclusions. Further, recent research on commonly used measures in social and personality psychology showed that measures with less published validity evidence were less likely to show strong evidence for construct validity when evaluated in new data (Hussey & Hughes, 2020; Shaw, Cloos, Luong, Elbaz, & Flake, 2020). The lack of information about measures is a critical problem that could stem from underreporting, ignorance, negligence, misrepresentation, or some combination of these factors.…”
(mentioning)
confidence: 99%
“…The measures of problematic SMU often lack construct validity and may produce inflated effect sizes. For example, scales of problematic SMU often entail "negative outcomes" as a diagnostic criterion and may prime respondents to report negative effects (e.g., Mieczkowski et al., 2020; Shaw et al., 2020). Purely descriptive measures of SMU, on the other hand, can be equally problematic because, depending on the researcher's focus, the same variable (e.g., the objective number of followers) can be used as a proxy for popularity or anxiety (cf.…”
Section: Operationalization of Social Media Use (mentioning)
confidence: 99%
“…For example, with psychometrically sound scales, our alternative ways of computing scale composites (unweighted average score across items, sum score, or first component from PCA) should have been very nearly equivalent (e.g., McNeish & Wolf, 2020). Yet they resulted in a considerable UMV, which may reflect inadequate attention to psychometric scale development amongst our (non-random) sample of social and cognitive multi-lab studies (see also Shaw et al., 2020). That said, these multi-lab projects were replication projects, and there is an understandable tension between exact replication and design improvements.…”
(mentioning)
confidence: 99%
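The statement above contrasts three common ways of computing a scale composite. As a minimal illustration of that comparison (hypothetical simulated data, not the cited study's code), the following computes all three for a unidimensional scale and checks how closely they agree:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Hypothetical 6-item scale, 300 respondents, driven by a single latent trait.
latent = rng.normal(size=(300, 1))
items = latent + rng.normal(scale=0.7, size=(300, 6))

mean_score = items.mean(axis=1)                               # unweighted average across items
sum_score = items.sum(axis=1)                                 # sum score
pca_score = PCA(n_components=1).fit_transform(items).ravel()  # first principal component

# For a psychometrically sound, unidimensional scale the three composites should
# be near-interchangeable: absolute correlations close to 1. (The sign of a PCA
# component is arbitrary, so its raw correlation may come out negative.)
print(np.abs(np.corrcoef([mean_score, sum_score, pca_score])).round(3))
```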