During the approximately 18-32 thousand years of domestication, dogs and humans have shared a similar social environment. Dog and human vocalizations are thus familiar and relevant to both species, although they belong to evolutionarily distant taxa, as their lineages split approximately 90-100 million years ago. In this first comparative neuroimaging study of a nonprimate and a primate species, we made use of this special combination of shared environment and evolutionary distance. We presented dogs and humans with the same set of vocal and nonvocal stimuli to search for functionally analogous voice-sensitive cortical regions. We demonstrate that voice areas exist in dogs and that they show a similar pattern to anterior temporal voice areas in humans. Our findings also reveal that sensitivity to vocal emotional valence cues engages similarly located nonprimary auditory regions in dogs and humans. Although parallel evolution cannot be excluded, our findings suggest that voice areas may have a more ancient evolutionary origin than previously known.
We investigated neural mechanisms that support voice recognition in a training paradigm with fMRI. The same listeners were trained on different weeks to categorize the mid-regions of voice-morph continua as an individual's voice. Stimuli implicitly defined a voice-acoustics space, and training explicitly defined a voice-identity space. The pre-defined centre of the voice category was shifted from the acoustic centre each week in opposite directions, so the same stimuli had different training histories on different tests. Cortical sensitivity to voice similarity appeared over different time-scales and at different representational stages. First, there were short-term adaptation effects: increasing acoustic similarity to the directly preceding stimulus led to haemodynamic response reduction in the middle/posterior STS and in right ventrolateral prefrontal regions. Second, there were longer-term effects: response reduction was found in the orbital/insular cortex for stimuli that were most versus least similar to the acoustic mean of all preceding stimuli, and, in the anterior temporal pole, the deep posterior STS and the amygdala, for stimuli that were most versus least similar to the trained voice-identity category mean. These findings are interpreted as effects of neural sharpening of long-term stored typical acoustic and category-internal values. The analyses also reveal anatomically separable voice representations: one in a voice-acoustics space and one in a voice-identity space. Voice-identity representations flexibly followed the trained identity shift, and listeners with a greater identity effect were more accurate at recognizing familiar voices. Voice recognition is thus supported by neural voice spaces that are organized around flexible 'mean voice' representations.
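To make the three similarity measures concrete, here is a minimal Python sketch of how such per-trial regressors could be derived. The acoustic feature coordinates, the Euclidean distance metric, and all names are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np

def similarity_regressors(stimuli, category_mean):
    """Per-trial similarity regressors for a sequence of voice-morph stimuli.

    stimuli       : (n_trials, n_features) acoustic coordinates of each
                    stimulus (hypothetical features, e.g. morph-space position)
    category_mean : (n_features,) centre of the trained voice-identity category

    Returns three negative-distance ("similarity") series:
      to_previous - similarity to the directly preceding stimulus
                    (short-term adaptation effect)
      to_acoustic - similarity to the running mean of all preceding stimuli
                    (long-term voice-acoustics effect)
      to_identity - similarity to the trained category mean
                    (long-term voice-identity effect)
    """
    stimuli = np.asarray(stimuli, dtype=float)
    n = len(stimuli)
    to_previous = np.full(n, np.nan)  # undefined for the first trial
    to_acoustic = np.full(n, np.nan)
    to_identity = -np.linalg.norm(stimuli - category_mean, axis=1)
    for t in range(1, n):
        to_previous[t] = -np.linalg.norm(stimuli[t] - stimuli[t - 1])
        to_acoustic[t] = -np.linalg.norm(stimuli[t] - stimuli[:t].mean(axis=0))
    return to_previous, to_acoustic, to_identity
```

Because the trained category mean is shifted from the acoustic mean in opposite directions across weeks, the acoustic and identity regressors dissociate for the same stimuli, which is what allows the two representational spaces to be separated.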
During speech processing, human listeners can separately analyze lexical and intonational cues to arrive at a unified representation of communicative content. The evolution of this capacity can be best investigated by comparative studies. Using functional magnetic resonance imaging, we explored whether and how dog brains segregate and integrate lexical and intonational information. We found a left-hemisphere bias for processing meaningful words, independently of intonation; a right auditory brain region for distinguishing intonationally marked and unmarked words; and increased activity in primary reward regions only when both lexical and intonational information were consistent with praise. Neural mechanisms to separately analyze and integrate word meaning and intonation in dogs suggest that this capacity can evolve in the absence of language.
Humans excel at assessing conspecific emotional valence and intensity based solely on non-verbal vocal bursts, which are also common in other mammals. It is not known, however, whether human listeners rely on similar acoustic cues to assess emotional content in conspecific and heterospecific vocalizations, and which acoustic parameters affect their performance. Here, for the first time, we directly compared the emotional valence and intensity perception of dog and human non-verbal vocalizations. We revealed similar relationships between acoustic features and emotional valence and intensity ratings of human and dog vocalizations: those with shorter call lengths were rated as more positive, whereas those with a higher pitch were rated as more intense. Our findings demonstrate that humans rate conspecific emotional vocalizations according to basic acoustic rules, and that they apply similar rules when processing dog vocal expressions. This suggests that humans may utilize similar mental mechanisms for recognizing human and heterospecific vocal emotions.
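A minimal sketch of the kind of feature-rating relationship reported: the data below are simulated and only preserve the direction of the two effects (shorter calls rated as more positive, higher-pitched calls rated as more intense); all values and coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated acoustic features for a set of vocalizations (hypothetical ranges):
# call length in seconds and fundamental frequency (pitch) in Hz.
call_length = rng.uniform(0.1, 2.0, size=n)
pitch_hz = rng.uniform(150.0, 900.0, size=n)

# Simulated ratings following the reported directions:
# shorter calls -> more positive valence; higher pitch -> higher intensity.
valence = 5.0 - 1.5 * call_length + rng.normal(0.0, 0.5, size=n)
intensity = 1.0 + 0.005 * pitch_hz + rng.normal(0.0, 0.5, size=n)

def slope(x, y):
    """Least-squares slope of y on x (model with intercept)."""
    X = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0]

print(f"valence ~ call length: slope = {slope(call_length, valence):+.2f}")
print(f"intensity ~ pitch:     slope = {slope(pitch_hz, intensity):+.4f}")
```

With real rating data, the same slope estimates (negative for valence on call length, positive for intensity on pitch) would quantify the acoustic rules described above.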
There is an ongoing need to improve animal models for investigating human behavior and its biological underpinnings. The domestic dog (Canis familiaris) is a promising model in cognitive neuroscience. However, before it can contribute to advances in this field in a comparative, reliable, and valid manner, several methodological issues warrant attention. We review recent non-invasive canine neuroscience studies, primarily focusing on (i) variability among dogs and between dogs and humans in cranial characteristics, and (ii) generalizability across dog and dog-human studies. We argue not for methodological uniformity but for functional comparability between methods, experimental designs, and neural responses. We conclude that the dog may become an innovative and unique model in comparative neuroscience, complementing more traditional models.
Background and aims: Dogs have recently become an important model species for comparative social and cognitive neuroscience. Brain template-related label maps are essential for functional magnetic resonance imaging (fMRI) data analysis, as they allow neural responses to be localized. In this study, we present a detailed, individual-based, T1-weighted MRI-based brain label map for use in dog neuroimaging analysis. Methods: A typical, medium-headed dog (a 7.5-year-old male Golden Retriever) was selected from a cohort of 22 dogs, based on brain morphology (shape, size, and gyral pattern), to serve as the template for a label map. Results: Eighty-six 3-dimensional labels were created to highlight the main cortical (cerebral gyri on the lateral and medial sides) and subcortical (thalamus, caudate nucleus, amygdala, and hippocampus) structures of the prosencephalon and diencephalon, as well as the main parts of the brainstem (mesencephalon and rhombencephalon). Discussion: Importantly, this label map (a) is considerably more detailed than any available dog brain template; (b) is easy to use with freeware and commercial neuroimaging software for MRI and fMRI analysis; and (c) can be registered to other existing templates, including a recent average-based dog brain template. Using the coordinate system and label map proposed here can enhance precision and standardize localization in future canine neuroimaging studies.
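As a minimal illustration of point (c): the sketch below assumes the two templates are already spatially aligned (a real workflow would first estimate an affine or nonlinear registration with a tool such as ANTs or FSL) and simply resamples the label map onto the target template's voxel grid with nilearn. File names are placeholders.

```python
from nilearn import image

# Placeholder file names; any NIfTI label map and target template would do.
label_map = image.load_img("dog_label_map.nii.gz")        # 86 integer labels
target_template = image.load_img("average_template.nii.gz")

# Nearest-neighbour interpolation is essential for label images: it keeps
# every voxel an exact integer label instead of blending adjacent labels
# into meaningless intermediate values.
labels_in_target = image.resample_to_img(
    label_map, target_template, interpolation="nearest"
)
labels_in_target.to_filename("dog_label_map_in_template_space.nii.gz")
```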
The social significance of recognizing the person who talks to us is obvious, but the neural mechanisms that mediate talker identification are unclear. Regions along the bilateral superior temporal sulcus (STS) and the inferior frontal cortex (IFC) of the human brain are selective for voices, and they are sensitive to rapid voice changes. Although it has been proposed that voice recognition is supported by prototype-centered voice representations, the involvement of these category-selective cortical regions in the neural coding of such "mean voices" has not previously been demonstrated. Using fMRI in combination with a voice identity learning paradigm, we show that voice-selective regions are involved in the mean-based coding of voice identities. Voice typicality is encoded on a supra-individual level in the right STS along a stimulus-dependent, identity-independent (i.e., voice-acoustic) dimension, and on an intra-individual level in the right IFC along a stimulus-independent, identity-dependent (i.e., voice identity) dimension. Voice recognition therefore entails at least two anatomically separable stages, each characterized by neural mechanisms that reference the central tendencies of voice categories.
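The two typicality dimensions can be sketched as follows; the feature space, the Euclidean metric, and the function names are assumptions for illustration, not the study's analysis.

```python
import numpy as np

def typicality_measures(features, speaker_ids):
    """Two 'mean voice' typicality measures per stimulus.

    features    : (n_stimuli, n_features) acoustic descriptors (hypothetical)
    speaker_ids : (n_stimuli,) identity label of each stimulus

    Returns negative distances, so larger = more typical:
      acoustic_typ - distance to the grand acoustic mean across all voices
                     (supra-individual, identity-independent dimension)
      identity_typ - distance to the stimulus' own speaker's mean voice
                     (intra-individual, identity-dependent dimension)
    """
    features = np.asarray(features, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    acoustic_typ = -np.linalg.norm(features - features.mean(axis=0), axis=1)
    identity_typ = np.empty(len(features))
    for sid in np.unique(speaker_ids):
        mask = speaker_ids == sid
        identity_typ[mask] = -np.linalg.norm(
            features[mask] - features[mask].mean(axis=0), axis=1
        )
    return acoustic_typ, identity_typ
```

Correlating each measure with trial-wise responses in the respective regions of interest would be one way to test for the supra-individual (right STS) versus intra-individual (right IFC) coding described above.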