“…We established a main effect of dataset type on recognition accuracy, F(2, 7317) = 113.92, p < .001, ηp² = .033, with AFEW containing the least detectable emotions (M = 0.40, 95% CI [0.38, 0.42], p < .001) and RAVDESS containing the most detectable emotions (M = 0.57, 95% CI [0.56, 0.59], p < .001). Consistent with the results of other studies [3], accuracy scores for emotion labels were higher for acted facial expressions (RAVDESS: M = 0.57, 95% CI [0.56, 0.59]; SAVEE: M = 0.54, 95% CI [0.52, 0.56]) than for the more challenging 'in the wild' expressions (AFEW: M = 0.40, 95% CI [0.38, 0.42], p < .001). A significant interaction between algorithm type and dataset type was also revealed, F(6, 7317) = 7.45, p < .001, ηp² = .007.…”
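
As a rough cross-check on the effect sizes quoted above, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard fixed-effects ANOVA identity, partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below is an illustration only: the function name `partial_eta_squared` is hypothetical, and it assumes a plain fixed-effects design, which the excerpt does not confirm.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic.

    Uses the fixed-effects ANOVA identity (an assumption about the design):
        eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    """
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# F values and degrees of freedom as quoted in the excerpt.
dataset_effect = partial_eta_squared(113.92, 2, 7317)      # main effect of dataset type
interaction_effect = partial_eta_squared(7.45, 6, 7317)    # algorithm type x dataset type

print(f"dataset type: {dataset_effect:.3f}")
print(f"algorithm x dataset: {interaction_effect:.3f}")
```

Values obtained this way are approximations and may not match the reported effect sizes exactly, since the published figures depend on the authors' model specification and rounding.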