2023
DOI: 10.1016/j.isci.2023.108204

Beyond speech: Exploring diversity in the human voice

Andrey Anikin,
Valentina Canessa-Pollard,
Katarzyna Pisanski
et al.

Cited by 8 publications
(9 citation statements)
references
References 50 publications
“…Speech and song are produced by the same vocal tract, yet each makes distinct demands on musculature, breathing, and motor control mechanisms 13,14, raising the possibility that certain acoustical cues could serve as markers of each category 15. However, even though people readily distinguish speech and song, the cues underlying the categories, even within cultures, are far from clear 11,16–19, so that such a claim is difficult to address. Indeed, even if speech and singing reliably exist as separate, recognizable entities, their cognitive representation could depend mostly on learned regularities that are particular to each cultural group.…”
Section: Introduction (mentioning)
confidence: 99%
“…Next, Anikin et al. (82) curated a different global recording dataset, including not only song and speech but also various nonverbal vocalizations (e.g., laughs, cries, and screams). Their analyses using spectrotemporal modulations also confirmed lower pitch in speech and steadier notes in singing.…”
Section: Discussion (mentioning)
confidence: 99%
“…In the absence of suitable datasets for doing so, we tested at least the inter-rater reliability with which several trained raters performed manual annotation of NLP episodes. Specifically, we asked the attendants of the NLP workshop in St. Etienne in June 2023 to note all NLP episodes in a randomly selected subset of 23 vocalizations from a published corpus, all of which were reported as containing some NLP in the original publication [21]. The recordings included 10 human nonverbal vocalizations (5F + 5M), 10 speech samples (5F + 5M), and three samples of a cappella singing (2F + 1M); the duration varied from 2 to 10 s. Ten raters independently annotated four NLP types (frequency jumps, sidebands, subharmonics, and chaos).…”
Section: NLP Annotation and Quantification (mentioning)
confidence: 99%
“…As an exemplary check of NLP specificity, we calculated a variety of acoustic features (generic, NLP-specific, and derived from nonlinear time series analysis), frame by frame, in 5000 fully synthetic vocalizations (with ground truth of NLP presence and type known a priori), as well as in 1518 audio recordings of human nonverbal vocalizations, singing, and speech from [21] with a total duration of two hours and nearly 300,000 overlapping STFT frames 50 ms each (with NLP annotated manually). We then compared the values of each acoustic feature in STFT frames depending on the presence and type of NLP (see vignette analysis_any-NLP).…”
Section: NLP Annotation and Quantification (mentioning)
confidence: 99%