“…Body gestures, facial expressions, and especially finger and hand movements that require a high level of temporal and spatial accuracy are also involved in music perception. This information provides visual cues which assist the intelligibility of music (Thompson et al., 2005; Molnar-Szakacs and Overy, 2006; Repp and Knoblich, 2009; Behne and Wöllner, 2011; Platz and Kopiez, 2012; Maes et al., 2014), as has similarly been observed for speech (Klucharev et al., 2003; Schwartz et al., 2004; Van Wassenhove et al., 2005; Stekelenburg and Vroomen, 2007; Arnal et al., 2009; Pilling, 2009; Paris et al., 2013, 2016a; Baart and Samuel, 2015; Biau and Soto-Faraco, 2015; Hsu et al., 2016). For example, in audiovisual (AV) speech a talker's facial articulations begin before the sound onset, providing a perceiver with potential cues to predict the upcoming speech sound and thereby enhance AV speech perception relative to audio alone (Besle et al., 2004; Schwartz et al., 2004; Paris et al., 2013).…”