This study highlights the congruence between objective and subjective measurements of vocal accuracy in this first direct comparison of the two. Our results confirm the relevance of the pitch interval deviation criterion for vocal accuracy assessment. Furthermore, the number of tonality modulations is also a salient criterion in perceptual rating and should be taken into account in studies using acoustic analyses.
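For readers unfamiliar with the metric, the pitch interval deviation criterion can be sketched as the mean absolute difference, in cents, between the melodic intervals a singer produces and the intervals of the target melody. The snippet below is a minimal illustration under that assumption; the note values, the single f0 estimate per note, and the function name are ours, not the authors' implementation.

```python
import numpy as np

def interval_deviation_cents(sung_hz, target_midi):
    """Mean absolute deviation (cents) between sung and target melodic
    intervals, given one f0 estimate per note (Hz) and the target melody
    as MIDI note numbers."""
    sung_midi = 69 + 12 * np.log2(np.asarray(sung_hz, dtype=float) / 440.0)
    sung_intervals = np.diff(sung_midi)                      # semitones between successive notes
    target_intervals = np.diff(np.asarray(target_midi, dtype=float))
    return float(np.mean(np.abs(sung_intervals - target_intervals)) * 100)

# Illustrative values: an ascending major third sung about a third of a
# semitone too narrow.
print(round(interval_deviation_cents([261.6, 323.0], [60, 64])))  # ~35 cents
```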
Similarities and differences between speech and song are often examined. However, defining these two types of vocalization perceptually is challenging: the prototypical characteristics of speech or song support top-down processes, which influence listeners' perception of acoustic information. In order to examine vocal features associated with speaking and singing, we propose an innovative approach designed to facilitate bottom-up mechanisms in perceiving vocalizations by using material situated between speech and song: speechsong. Twenty-five participants evaluated 20 performances of a speechsong composition by Arnold Schoenberg, “Pierrot lunaire” op. 21 (1912), rating 20 features of vocal-articulatory expression. Raters provided reliable judgments concerning the vocal features used by the performers and did not show strong preferences or specific expectations regarding Schoenberg's piece. Examining the relationship between these vocal features and the impression of song or speech confirms the importance of pitch (height, contour, range), but also points to the relevance of register, timbre, tension, and faucal distance. Besides highlighting vocal features associated with speech and song, this study supports the relevance of focusing on a theoretical middle category in order to better understand vocal expression in song and speech.
An important aspect of the perceived quality of vocal music is the degree to which the vocalist sings in tune. Although most listeners seem sensitive to vocal mistuning, little is known about the development of this perceptual ability or how it differs between listeners. Motivated by a lack of suitable preexisting measures, we introduce in this article an adaptive and ecologically valid test of mistuning perception ability. The stimulus material consisted of short excerpts (6 to 12 s in length) from pop music performances (obtained from MedleyDB; Bittner et al., 2014) for which the vocal track was pitch-shifted relative to the instrumental tracks. In a first experiment, 333 listeners were tested on a two-alternative forced choice task requiring discrimination between a pitch-shifted and an unaltered version of the same audio clip. Explanatory item response modeling was then used to calibrate an adaptive version of the test. A subsequent validation experiment applied this adaptive test to 66 participants with a broad range of musical expertise, producing evidence of the test’s reliability, convergent validity, and divergent validity. The test is ready to be deployed as an experimental tool and should make an important contribution to our understanding of the human ability to judge mistuning. Electronic supplementary material: The online version of this article (10.3758/s13428-019-01225-1) contains supplementary material, which is available to authorized users.
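The stimulus manipulation described, shifting the vocal track relative to the accompaniment, can be approximated with standard audio tooling. The sketch below assumes separate vocal and backing stems and uses librosa's generic pitch shifter; the file names and the 50-cent shift are placeholders, and this is not the authors' stimulus-generation code.

```python
import librosa
import numpy as np
import soundfile as sf

# Placeholder stems; MedleyDB provides such separated tracks.
vocals, sr = librosa.load("vocals.wav", sr=None, mono=True)
backing, _ = librosa.load("backing.wav", sr=sr, mono=True)

shift_cents = 50  # assumed mistuning magnitude for the "shifted" alternative
shifted = librosa.effects.pitch_shift(vocals, sr=sr, n_steps=shift_cents / 100)

# Remix the shifted vocal with the unaltered accompaniment and normalise.
n = min(len(shifted), len(backing))
mix = shifted[:n] + backing[:n]
sf.write("mistuned_clip.wav", mix / np.max(np.abs(mix)), sr)
```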
The inability to vocally match a pitch can be caused by poor pitch perception or by poor vocal-motor control. Although previous studies have tried to examine the relationship between pitch perception and vocal production, they have failed to control for the timbre of the target to be matched. In the present study, we compared pitch-matching accuracy with an unfamiliar instrument (the slider) and with the voice; the slider was designed to play back recordings of the participant's own voice. We also measured pitch accuracy in singing a familiar melody ("Happy Birthday") to assess the relationship between single-pitch-matching tasks and melodic singing. Our results showed that participants (all nonmusicians) were significantly better at matching recordings of their own voices with the slider than with their voice, indicating that vocal-motor control is an important limiting factor on singing ability. We also found significant correlations between the ability to sing a melody in tune and vocal pitch matching, but not pitch matching on the slider. Better melodic singers also tended to have higher-quality voices (as measured by acoustic variables). These results provide important evidence about the role of vocal-motor control in poor singing ability and demonstrate that single-pitch-matching tasks can be useful in measuring general singing abilities.
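Single-pitch-matching accuracy is commonly summarised as the error, in cents, between the produced f0 and the target f0. The sketch below assumes one f0 estimate per trial and uses hypothetical target/response values; it is not the authors' analysis code.

```python
import numpy as np

def matching_error_cents(produced_hz, target_hz):
    """Signed pitch-matching error in cents (positive = sharp)."""
    return 1200 * np.log2(np.asarray(produced_hz) / np.asarray(target_hz))

# Hypothetical trials: target pitches vs. the f0 the participant produced.
targets = np.array([220.0, 246.9, 261.6])
produced = np.array([215.0, 250.0, 270.0])
errors = matching_error_cents(produced, targets)
print(np.round(errors, 1))                        # per-trial signed error
print(round(float(np.mean(np.abs(errors))), 1))   # mean absolute error as an accuracy summary
```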
In recent years there has been a remarkable increase in research focusing on deficits of pitch production in singing. A critical concern has been the identification of “poor pitch singers,” individuals we refer to more generally as having a “vocal pitch imitation deficit.” The present paper includes a critical assessment of the assumption that vocal pitch imitation abilities can be treated as a dichotomy. Though this practice may be useful for data analysis and may be necessary within educational practice, we argue that it is complicated by a series of problems. Moreover, we argue that a more informative (and less problematic) approach comes from analyzing vocal pitch imitation abilities on a continuum, referred to as effect magnitude regression, and offer examples of how researchers may analyze data using this approach. We also argue that understanding of this deficit may be better served by focusing on the effects of experimental manipulations on different individuals, rather than by treating values of individual measures, and isolated tasks, as absolute measures of ability.
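One way to picture the continuum approach, using simulated numbers rather than any real data set, is to relate the size of an experimental effect to each participant's baseline accuracy instead of first splitting participants into "accurate" and "poor" groups. All quantities below are made-up illustrations, not the authors' analyses.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated per-participant mean absolute pitch error (cents) at baseline and
# under some experimental manipulation; all values are fabricated for the sketch.
baseline = rng.gamma(shape=2.0, scale=40.0, size=60)
manipulated = baseline + 0.3 * baseline + rng.normal(0.0, 15.0, size=60)

# Dichotomous view: an arbitrary 100-cent cutoff labels "poor" singers.
print("proportion labelled poor:", round(float((baseline > 100).mean()), 2))

# Continuum view: regress the manipulation effect on baseline accuracy.
slope, intercept = np.polyfit(baseline, manipulated - baseline, 1)
print("effect change per cent of baseline error:", round(float(slope), 2))
```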
This study aims to validate our method for measuring vocal accuracy in a melodic context. We analysed the popular song 'Happy Birthday' sung by 63 occasional and 14 professional singers using AudioSculpt and OpenMusic (IRCAM, Paris, France). For the pitch interval deviation criterion, we replicated the profile of occasional singers described in the literature (the slower the performance, the more accurate it is). Our results also confirm that professional singers sing more accurately than occasional singers, except when a Western operatic singing technique is involved. These results support the relevance of our method for analysing the vocal accuracy of occasional and professional singers and led us to discuss adaptations to be implemented for analysing the accuracy of operatic voices.
Vocalizations including laughter, cries, moans, or screams constitute a potent source of information about the affective states of others. It is typically conjectured that the higher the intensity of the expressed emotion, the better the classification of affective information. However, attempts to map the relation between affective intensity and inferred meaning are controversial. Based on a newly developed stimulus database of carefully validated non-speech expressions ranging across the entire intensity spectrum from low to peak, we show that this intuition is false. In three experiments (N = 90), we demonstrate that intensity in fact plays a paradoxical role. Participants were asked to rate and classify the authenticity, intensity, and emotion, as well as the valence and arousal, of a wide range of vocalizations. Listeners are clearly able to infer expressed intensity and arousal; in contrast, and surprisingly, emotion category and valence show a perceptual sweet spot: moderate and strong emotions are clearly categorized, whereas peak emotions are maximally ambiguous. This finding, which converges with related observations from visual experiments, raises interesting theoretical challenges for the emotion communication literature.
What, if any, similarities and differences between song and speech are consistent across cultures? Both song and speech are found in all known human societies and are argued to share evolutionary roots and cognitive resources, yet no studies have compared similarities and differences between song and speech across languages on a global scale. We will compare sets of matched song/speech recordings produced by our 81 coauthors whose 1st/heritage languages span 23 language families. Each recording set consists of singing, recited lyrics, and spoken description, plus an optional instrumental version of the sung melody to allow us to capture a “musi-linguistic continuum” from instrumental music to naturalistic speech. Our literature review and pilot analysis using five audio recording sets (by speakers of Japanese, English, Farsi, Yoruba, and Marathi) led us to make six predictions for confirmatory analysis comparing song vs. spoken descriptions: three consistent differences and three consistent similarities. For differences, we predict that: 1) songs will have higher pitch than speech, 2) songs will be slower than speech, and 3) songs will have more stable pitch than speech. For similarities, we predict that 4) pitch interval size, 5) timbral brightness, and 6) pitch declination will be similar for song and speech. Because our opportunistic language sample (approximately half are Indo-European languages) and unusual design involving coauthors as participants (approximately 1/5 of coauthors had some awareness of our hypotheses when we recorded our singing/speaking) could affect our results, we will include robustness analyses to ensure our conclusions are robust to these biases, should they exist. Other features (e.g., rhythmic isochronicity, loudness) and comparisons involving instrumental melodies and recited lyrics will be investigated through post-hoc exploratory analyses. Our sample size of n=80 people providing sung/spoken recordings already exceeds the required number of recordings (i.e. 60) to achieve 95% power with the alpha level of 0.05 for the hypothesis testing of the selected six features. Our study will provide diverse cross-linguistic empirical evidence regarding the existence of cross-cultural regularities in song and speech, shed light on factors shaping humanity’s two universal vocal communication forms, and provide rich cross-cultural data to generate new hypotheses and inform future analyses of other factors (e.g., functional context, sex, age, musical/linguistic experience) that may shape global musical and linguistic diversity.
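The pitch-related predictions (higher and more stable pitch in song than in speech) map onto simple f0 statistics. The sketch below uses librosa's pYIN tracker to compute a median pitch and a frame-to-frame pitch-stability measure per recording; the file names, frequency range, and feature definitions are illustrative assumptions, not the registered analysis pipeline.

```python
import librosa
import numpy as np

def f0_features(path):
    """Median pitch and local pitch variability (both in semitones) from
    the voiced frames of a pYIN f0 track."""
    y, sr = librosa.load(path, sr=None, mono=True)
    f0, voiced, _ = librosa.pyin(y, fmin=65.0, fmax=1000.0, sr=sr)
    semitones = 12 * np.log2(f0[voiced])      # voiced frames only, semitones re 1 Hz
    return {
        "median_pitch_st": float(np.median(semitones)),
        "frame_to_frame_change_st": float(np.median(np.abs(np.diff(semitones)))),
    }

# Placeholder file names: one sung and one spoken recording from the same person.
print("song:  ", f0_features("song.wav"))
print("speech:", f0_features("speech.wav"))
```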