2019
DOI: 10.1038/s41467-019-11605-y

Parallels in the sequential organization of birdsong and human speech

Abstract: Human speech possesses a rich hierarchical structure that allows for meaning to be altered by words spaced far apart in time. Conversely, the sequential structure of nonhuman communication is thought to follow non-hierarchical Markovian dynamics operating over only short distances. Here, we show that human speech and birdsong share a similar sequential structure indicative of both hierarchical and Markovian organization. We analyze the sequential dynamics of song from multiple songbird species and speech from …

Cited by 75 publications (151 citation statements)
References 55 publications (90 reference statements)
“…Given the artificial nature of optogenetic stimulation, we wondered whether USVs elicited by optogenetic activation of POA neurons were acoustically similar to the USVs that are normally produced by mice during social interactions. To compare the acoustic features of optogenetically-elicited USVs (opto-USVs) to those of USVs produced spontaneously to a nearby female, we employed a recently described method using variational autoencoders (VAEs) (Goffinet, 2019; Sainburg et al., 2019). Briefly, the VAE is an unsupervised modeling approach that uses spectrograms of vocalizations as inputs and from these data learns a pair of probabilistic maps, an "encoder" and a "decoder," capable of compressing vocalizations into a small number of latent features while attempting to preserve as much information as possible (Fig.…”
Section: Acoustic Characterization of USVs Elicited by Activation of … (mentioning)
confidence: 99%
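The encoder/decoder pair described in this excerpt follows the standard VAE recipe. Below is a minimal sketch in PyTorch, assuming flattened single-channel spectrogram inputs; the fully connected layers, 32-dimensional latent space, and Gaussian reconstruction loss are illustrative choices, not the architectures of the cited papers.

```python
# Minimal VAE sketch for compressing vocalization spectrograms into a small
# number of latent features. Illustrative only; not the exact architecture
# from Goffinet (2019) or Sainburg et al. (2019).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectrogramVAE(nn.Module):
    def __init__(self, n_pixels=128 * 128, n_latent=32):
        super().__init__()
        # Encoder: probabilistic map from spectrogram to latent mean/log-variance.
        self.enc = nn.Sequential(nn.Linear(n_pixels, 512), nn.ReLU())
        self.mu = nn.Linear(512, n_latent)
        self.logvar = nn.Linear(512, n_latent)
        # Decoder: probabilistic map from latent features back to a spectrogram.
        self.dec = nn.Sequential(
            nn.Linear(n_latent, 512), nn.ReLU(), nn.Linear(512, n_pixels)
        )

    def forward(self, x):
        h = self.enc(x.flatten(1))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample latents while keeping gradients.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term ("preserve as much information as possible")...
    recon = F.mse_loss(x_hat, x.flatten(1), reduction="sum")
    # ...plus a KL term pulling the latent posterior toward a unit Gaussian.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

Training then amounts to minimizing `vae_loss` over batches of spectrograms shaped (N, 128, 128); the learned `mu` vectors serve as the per-vocalization latent features compared across conditions.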
“…Since Shannon's original work characterizing the sequential dependencies present in language, the structure underlying long-range information in language has been the subject of a great deal of interest in linguistics, statistical physics, cognitive science, and psychology [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20]. Long-range information content refers to the dependencies between discrete elements (e.g., units of spoken or written language) that persist over long sequential distances spanning words, phrases, sentences, and discourse.…”
Section: Introduction (mentioning)
confidence: 99%
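Operationally, this long-range information content is usually measured as the mutual information between elements separated by a lag d. A minimal sketch, assuming a sequence of discrete symbols (characters, words, or syllable labels); note that published analyses typically use bias-corrected MI estimators rather than this naive plug-in estimate.

```python
# Plug-in estimate of mutual information between sequence elements at lag d.
# Illustrative sketch; real analyses use bias-corrected estimators.
from collections import Counter
import math

def mutual_information(seq, d):
    pairs = list(zip(seq[:-d], seq[d:]))  # element pairs separated by lag d
    n = len(pairs)
    p_xy = Counter(pairs)                 # joint counts
    p_x = Counter(x for x, _ in pairs)    # marginal counts, first element
    p_y = Counter(y for _, y in pairs)    # marginal counts, second element
    mi = 0.0
    for (x, y), c in p_xy.items():
        p = c / n
        mi += p * math.log2(p * n * n / (p_x[x] * p_y[y]))
    return mi  # bits

# Example: MI decays with distance in a sequence with local structure.
seq = list("abcabcabdabcabeabc" * 50)
for d in (1, 2, 5, 10):
    print(d, round(mutual_information(seq, d), 3))
```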
“…Using the responses of these participants, Shannon derived an upper bound on the information added by including each preceding letter in the sequence. More recent investigations compute statistical dependencies directly from language corpora using either correlation functions [3,4,7,8,10,12,13] or mutual information (MI) functions [2,5,6,14] between elements in a sequence. In both cases, statistical dependencies grow weaker, on average, as the distance between elements increases.…”
Section: Introduction (mentioning)
confidence: 99%
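The distinction between Markovian and hierarchical organization shows up in how these MI functions decay with distance: Markov dynamics predict exponential decay, whereas hierarchical structure predicts a power law. Below is a hedged sketch of such a model comparison using scipy's `curve_fit` on a synthetic MI curve; the two functional forms match the framing above, but the fitting and model-selection details (the cited paper also considers a composite model) are illustrative.

```python
# Compare exponential vs. power-law fits to an MI-by-distance curve.
# Exponential decay suggests Markovian structure; power-law decay suggests
# hierarchical, long-range structure. Synthetic data for illustration.
import numpy as np
from scipy.optimize import curve_fit

def exponential(d, a, b):
    return a * np.exp(-d / b)

def power_law(d, c, alpha):
    return c * d ** (-alpha)

d = np.arange(1, 101, dtype=float)
mi = 0.5 * d ** -0.8 + np.random.normal(0, 0.005, d.size)  # synthetic MI

for name, f, p0 in [("exponential", exponential, (1.0, 10.0)),
                    ("power law", power_law, (1.0, 1.0))]:
    params, _ = curve_fit(f, d, mi, p0=p0, maxfev=10000)
    rss = np.sum((mi - f(d, *params)) ** 2)  # residual sum of squares
    print(f"{name}: params={params}, RSS={rss:.4g}")
```

The better-fitting form (here judged crudely by residual sum of squares; information criteria such as AIC are the more principled choice) indicates which class of sequential dynamics dominates.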
“…This data-driven approach is closely related to previous studies that have applied autoencoding to birdsong to generate spectrograms and interpolate syllables for use in playback experiments [38, 44]. Additionally, dimensionality reduction algorithms such as UMAP [29] and t-SNE [27], which we use here to visualize latent spaces, have previously been applied to raw spectrograms of birdsong syllables to aid in syllable clustering [37] and to visualize juvenile song learning [25]. Here, by contrast, we use the VAE as a general-purpose tool for quantifying vocal behavior, with a focus on cross-species comparisons and assessing variability across groups, individuals, and experimental conditions.…”
Section: Discussion (mentioning)
confidence: 96%
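For the visualization step described in this excerpt, a minimal sketch assuming the umap-learn, scikit-learn, and matplotlib packages; the random `latents` array and syllable `labels` below are placeholders standing in for per-vocalization VAE latent vectors and cluster assignments.

```python
# Project latent features (or flattened spectrograms) to 2-D for inspection.
# Placeholder data; in practice, `latents` would come from the trained VAE.
import numpy as np
import matplotlib.pyplot as plt
import umap
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
latents = rng.normal(size=(500, 32))   # placeholder latent vectors
labels = rng.integers(0, 5, size=500)  # placeholder syllable labels

umap_emb = umap.UMAP(n_components=2, random_state=0).fit_transform(latents)
tsne_emb = TSNE(n_components=2, random_state=0).fit_transform(latents)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, emb, title in zip(axes, (umap_emb, tsne_emb), ("UMAP", "t-SNE")):
    ax.scatter(emb[:, 0], emb[:, 1], c=labels, s=5, cmap="tab10")
    ax.set_title(title)
plt.show()
```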