Birdsong provides a unique model for understanding the behavioral and neural bases of complex sequential behaviors. However, birdsong analyses require laborious effort to make the data quantitatively analyzable. Previous attempts have succeeded in reducing some of the human effort involved in classifying birdsong segments. The present study aimed to reduce that effort further while increasing classification performance. A linear-kernel support vector machine was employed to minimize the number of human-generated label samples needed for reliable classification of song elements, and to enable the classifier to handle high-dimensional acoustic features while avoiding overfitting. Songs of the Bengalese finch, in which distinct elements (i.e., syllables) are arranged in complex sequential patterns, served as a representative test case from the neuroscience research field. Three evaluations tested (1) the validity and accuracy of the algorithm while exploring appropriate classifier settings, (2) whether accuracy was maintained as the instruction dataset was reduced, and (3) performance in classifying a large dataset with minimal manual labeling. Evaluation (1) showed that the algorithm classified song syllables with 99.5% accuracy. This accuracy was maintained in evaluation (2), even when the human-classified instruction data were reduced to a one-minute excerpt (300–400 syllables) used to classify a two-minute excerpt. Accuracy remained comparable (98.7%) for a large target dataset of whole-day recordings (∼30,000 syllables). The linear-kernel support vector machine thus achieved sufficient accuracy with minimal manually generated instruction data in classifying birdsong elements. The proposed methodology should reduce laborious processes in birdsong analysis without sacrificing reliability, and can thereby help accelerate behavioral and neuroscience studies using songbirds.
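For readers who want a concrete starting point, the following is a minimal, self-contained sketch of the classification approach described above: fixed-length spectral features per syllable fed to a linear-kernel SVM trained on a small labeled subset. The feature extraction, synthetic stand-in data, and hyperparameters are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch: classify song syllables with a linear-kernel SVM trained
# on a small labeled "instruction" set. Feature extraction and parameters
# are illustrative assumptions, not the authors' exact pipeline.
import numpy as np
from scipy.signal import spectrogram
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

FS = 32000  # sampling rate (Hz); assumed for this demo

def syllable_features(waveform, n_bins=64):
    """Reduce a syllable waveform to a fixed-length log-spectral profile."""
    _, _, sxx = spectrogram(waveform, fs=FS, nperseg=256, noverlap=128)
    profile = np.log(sxx + 1e-10).mean(axis=1)           # average over time
    idx = np.linspace(0, len(profile) - 1, n_bins).astype(int)
    return profile[idx]                                   # fixed-length vector

def fake_syllable(f0, dur=0.05, rng=None):
    """Synthesize a noisy tone standing in for a recorded syllable."""
    t = np.arange(int(FS * dur)) / FS
    return np.sin(2 * np.pi * f0 * t) + 0.1 * rng.standard_normal(t.size)

rng = np.random.default_rng(0)
# Two syllable "types" in different spectral ranges, as a stand-in dataset.
X = np.array([syllable_features(fake_syllable(f, rng=rng))
              for f in ([3000] * 50 + [6000] * 50)])
y = np.array([0] * 50 + [1] * 50)

# A small training subset mimics the ~1-minute human-labeled excerpt; the
# linear kernel keeps the model well-behaved in high-dimensional feature space.
clf = LinearSVC(C=1.0)
clf.fit(X[::5], y[::5])               # train on a 20% subset
print("accuracy:", accuracy_score(y, clf.predict(X)))
```

On real song data, `X` would be built from human-segmented syllables, and the held-out accuracy would play the role of the 99.5% and 98.7% figures reported above.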
Rodents' ultrasonic vocalizations (USVs) provide useful information for assessing their social behaviors. Despite previous efforts to classify subcategories of the time-frequency patterns of USV syllables and study their functional relevance, methods for detecting vocal elements in continuously recorded data have remained suboptimal. Here, we propose a novel procedure for detecting USV segments in continuous sound recordings that contain background noise from the observation of social behavior. The proposed procedure utilizes a stable version of the sound spectrogram and additional signal processing that reduces variation in the background noise, yielding better separation of vocal signals. It also provides precise time tracking of spectral peaks within each syllable. We demonstrated that this procedure can be applied to a variety of USVs obtained from several rodent species, and performance tests showed that it detects USV syllables more accurately than conventional detection methods.
Citation: Tachibana RO, Kanno K, Okabe S, Kobayasi KI, Okanoya K (2020) USVSEG: A robust method for segmentation of ultrasonic vocalizations in rodents. PLoS ONE 15(2): e0228907. https://doi.org/10.1371/journal.pone.0228907
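The following is a minimal sketch of the kind of segmentation this abstract describes: flatten the spectrogram's noise floor per frequency band, threshold, group contiguous above-threshold frames into syllables, and track each syllable's spectral peak over time. A simple per-band median subtraction stands in for USVSEG's spectrogram stabilization, and all thresholds and window settings are illustrative assumptions rather than the tool's actual parameters.

```python
# Minimal sketch of spectrogram-based USV segmentation: flatten the noise
# floor, threshold, group contiguous frames into syllables, track peaks.
# All thresholds/window settings are illustrative, not USVSEG's parameters.
import numpy as np
from scipy.signal import spectrogram

def segment_usvs(waveform, fs, thresh_db=10.0, min_dur=0.005):
    freqs, times, sxx = spectrogram(waveform, fs=fs, nperseg=512, noverlap=256)
    spec_db = 10 * np.log10(sxx + 1e-12)
    # "Flatten" stationary background noise by subtracting each frequency
    # band's median level across time.
    flat = spec_db - np.median(spec_db, axis=1, keepdims=True)
    active = flat.max(axis=0) > thresh_db           # frames with a strong peak
    # Group contiguous active frames into candidate syllables.
    edges = np.flatnonzero(np.diff(active.astype(int)))
    bounds = np.concatenate(([0], edges + 1, [active.size]))
    syllables = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        if not active[start]:
            continue
        if times[stop - 1] - times[start] < min_dur:
            continue                                 # too short: likely noise
        # Track the spectral peak frequency within the syllable.
        peak_freqs = freqs[np.argmax(flat[:, start:stop], axis=0)]
        syllables.append((times[start], times[stop - 1], peak_freqs))
    return syllables
```

Each returned tuple holds a syllable's onset, offset, and per-frame peak-frequency track, mirroring the peak-tracking output the abstract mentions.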
Birdsong provides a unique model for studying the control mechanisms of complex sequential behaviors. The present study aimed to demonstrate that multiple factors affect temporal control in song production. We analyzed the songs of Bengalese finches over various time ranges to identify factors that affected the durations of acoustic elements (notes) and silent intervals (gaps). Gaps showed more jitter across song renditions than notes did. Gaps were longer at branching points of the song sequence than at stereotyped transitions, and the duration of a gap was correlated with the duration of the note that preceded it. When looking at variation among song renditions, we found notable factors in three time ranges: within-day drift, within-bout changes, and local jitter. Note durations shortened over time from morning to evening. Within each song bout, note durations lengthened as singing progressed, whereas gap durations lengthened only during the late part of the bout. Further analysis after removing these drift factors confirmed that jitter remained in local song sequences. These results suggest that, on the basis of this note-gap relationship, distinct sources of temporal variability exist at multiple levels, and that song timing comprises a mixture of these sources.
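As a concrete illustration of the timing measures above, the sketch below derives note and gap durations from onset/offset annotations, compares their jitter, and tests the reported note-gap correlation. The input arrays are hypothetical stand-ins for annotated song data, not values from the study.

```python
# Minimal sketch of the timing analysis described above: derive note and gap
# durations from syllable onset/offset times and test whether each gap's
# duration correlates with that of the preceding note. The input arrays are
# hypothetical stand-ins for annotated song data.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical onset/offset times (s) for consecutive notes in one bout.
onsets = np.array([0.00, 0.12, 0.25, 0.40, 0.53])
offsets = np.array([0.08, 0.21, 0.33, 0.49, 0.60])

note_durs = offsets - onsets              # acoustic elements (notes)
gap_durs = onsets[1:] - offsets[:-1]      # silent intervals (gaps)

# Jitter comparison: coefficient of variation for notes vs. gaps.
def cv(x):
    return x.std() / x.mean()
print(f"CV notes={cv(note_durs):.3f}, CV gaps={cv(gap_durs):.3f}")

# Correlation between each gap and the note that precedes it.
r, p = pearsonr(note_durs[:-1], gap_durs)
print(f"note-gap correlation: r={r:.2f}, p={p:.3f}")
```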
Whether face gender perception is processed by encoding holistic (whole-face) or featural (face-part) information is a controversial issue. Although neuroimaging studies have identified brain regions related to face gender perception, the temporal dynamics of this process remain under debate. Here, we identified the mechanism and temporal dynamics of face gender perception. We used stereoscopic depth manipulation to create two conditions: the front and the behind condition. In the front condition, facial patches were presented stereoscopically in front of an occluder, and participants perceived them as disjoint parts (featural cues). In the behind condition, facial patches were presented stereoscopically behind the occluder and were amodally completed and unified into a coherent face (holistic cues). We performed three behavioral experiments and one electroencephalography (EEG) experiment, and compared the results of the front and behind conditions. We found faster reaction times (RTs) in the behind condition than in the front condition, and observed priming effects and aftereffects only in the behind condition. Moreover, the EEG experiment revealed that face gender perception is processed in a relatively late phase of visual recognition (200–285 ms). Our results indicate that holistic information is critical for face gender perception, and that this process occurs at a relatively late latency.
Understanding the structure and function of neural circuits underlying speech and language is a vital step toward better treatments for diseases of these systems. Songbirds, among the few animal orders that share with humans the ability to learn vocalizations from a conspecific, have provided many insights into the neural mechanisms of vocal development. However, research into vocal learning circuits has been hindered by a lack of tools for rapid genetic targeting of specific neuron populations to meet the quick pace of developmental learning. Here, we present a viral tool that enables fast and efficient retrograde access to projection neuron populations. In zebra finches, Bengalese finches, canaries, and mice, we demonstrate fast retrograde labeling of cortical or dopaminergic neurons. We further demonstrate the suitability of our construct for detailed morphological analysis, for in vivo imaging of calcium activity, and for multi-color brainbow labeling.
Precise neural sequences are associated with the production of well-learned, skilled behaviors. Yet how neural sequences arise in the brain remains unclear. In songbirds, premotor projection neurons in the cortical song nucleus HVC are necessary for producing learned song and exhibit precise sequential activity during singing. Using cell-type-specific calcium imaging, we identify populations of HVC premotor neurons associated with the beginning and ending of singing-related neural sequences. We characterize neurons that bookend singing-related sequences and neuronal populations that transition from sparse preparatory activity prior to song to precise neural sequences during singing. Recordings from downstream premotor neurons or the respiratory system suggest that pre-song activity may be involved in motor preparation to sing. These findings reveal population mechanisms associated with moving from non-vocal to vocal behavioral states, and suggest that precise neural sequences begin and end as part of orchestrated activity across functionally diverse populations of cortical premotor neurons.
The ultrasonic vocalizations of rats can transmit affective states to listeners. For example, rats typically produce shorter calls in a higher frequency range in social situations (pleasant calls: PC), whereas they emit longer calls in a lower frequency range in distress situations (distress calls: DC). Knowing which acoustical features contribute to auditory discrimination between these two call types will help to better characterize how rats perceive vocalized sounds, which in turn could inform models of vocalization processing in sensory systems more generally. Here, using an operant discrimination procedure, we examined the impact of various acoustical features on the discrimination of emotional ultrasonic vocalizations by systematically swapping three features (frequency range, duration, and residual frequency-modulation pattern) between the two emotional calls. After rats were trained to discriminate between PC and DC, we presented probe stimuli that were synthesized calls with one or two acoustical features swapped, and examined whether the rats judged these calls as PC or DC. The results revealed that all features were important for discrimination between the two call types, but that frequency range provided the most information. This supports the hypothesis that while rats use all acoustical features to perceive emotional vocalizations, they rely considerably on frequency cues.
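The sketch below illustrates the kind of feature swapping this abstract describes, using off-the-shelf time stretching and pitch shifting to give one call another call's duration and frequency range while leaving its residual frequency-modulation pattern intact. It is an illustrative stand-in, not the authors' actual synthesis procedure, and the helper functions are hypothetical.

```python
# Minimal sketch of the feature-swapping manipulation described above:
# give one call another call's duration (time stretch) and frequency range
# (pitch shift) while keeping its residual FM pattern. An illustrative
# stand-in, not the authors' actual synthesis procedure.
import numpy as np
import librosa

def dominant_freq(y, sr):
    """Crude estimate of a call's dominant frequency from its spectrum."""
    spec = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
    return freqs[np.argmax(spec)]

def swap_features(y_src, y_ref, sr, shift_semitones=None):
    """Return y_src with y_ref's duration and (optionally) frequency range."""
    # Swap duration: stretch the source to the reference call's length.
    rate = len(y_src) / len(y_ref)
    y = librosa.effects.time_stretch(y_src, rate=rate)
    # Swap frequency range: shift by the estimated offset between calls.
    if shift_semitones is None:
        shift_semitones = 12 * np.log2(dominant_freq(y_ref, sr) /
                                       dominant_freq(y_src, sr))
    return librosa.effects.pitch_shift(y, sr=sr, n_steps=shift_semitones)
```

A probe with only the duration swapped would simply omit the pitch-shift step; because pitch shifting preserves duration and time stretching preserves pitch, the two manipulations can be applied independently, matching the one- or two-feature swaps described above.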