Speech produced in the presence of noise (Lombard speech) is typically more intelligible than speech produced in quiet (plain speech) when presented at the same signal-to-noise ratio, but the factors responsible for the Lombard intelligibility benefit remain poorly understood. Previous studies have demonstrated a clear effect of spectral differences between the two speech styles and a lack of effect of fundamental frequency differences. The current study investigates a possible role for durational differences alongside spectral changes. Listeners identified keywords in sentences manipulated to possess either durational or spectral characteristics of plain or Lombard speech. Durational modifications were produced using linear or nonlinear time warping, while spectral changes were applied at the global utterance level or to individual time frames. Modifications were made to both plain and Lombard speech. No beneficial effects of durational increases were observed in any condition. Lombard sentences spoken at a speech rate substantially slower than their plain counterparts also failed to reveal a durational benefit. Spectral changes to plain speech resulted in large intelligibility gains, although not to the level of Lombard speech. These outcomes suggest that the durational increases seen in Lombard speech have little or no role in the Lombard intelligibility benefit.
In this paper, we propose two spatio-temporal silhouette representations, the silhouette energy image (SEI) and the silhouette history image (SHI), to characterize motion and shape properties for the recognition of human movements such as actions and activities in daily life. The SEI and SHI are constructed from the silhouette image sequence of an action. The span of the action, i.e., the difference between its end time and start time, is used to build the SHI. To address human shape variability, we used variation in the anthropometry of the person. We extract features based on geometric shape moments. We tested our approach successfully in indoor and outdoor environments. Our experimental results show that the proposed method of human action recognition is robust, flexible and efficient.
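As a rough illustration of how representations like the SEI and SHI could be built from a silhouette sequence, the Python sketch below makes several assumptions that the abstract does not state: the SEI is taken to be the per-pixel average of the binary silhouettes, the SHI is taken to be a history image in which each pixel stores the normalized time of its most recent occupancy (normalized by the span between start and end frames), and the shape features are plain central geometric moments. The function names `silhouette_energy_image`, `silhouette_history_image` and `central_moment` are hypothetical and are not taken from the paper.

```python
import numpy as np


def silhouette_energy_image(silhouettes):
    """Per-pixel mean of binary silhouettes (assumed SEI definition)."""
    frames = np.asarray(silhouettes, dtype=float)
    return frames.mean(axis=0)


def silhouette_history_image(silhouettes, start=0, end=None):
    """History image over frames start..end (assumed SHI definition).

    Each pixel holds the normalized time stamp of the last frame in which
    it belonged to the silhouette; pixels never occupied stay at 0.
    """
    frames = np.asarray(silhouettes, dtype=float)
    if end is None:
        end = frames.shape[0] - 1
    n_frames = end - start + 1  # span of the action in frames
    history = np.zeros(frames.shape[1:], dtype=float)
    for t in range(start, end + 1):
        weight = (t - start + 1) / n_frames  # 1/n for first frame, 1 for last
        history = np.where(frames[t] > 0, weight, history)
    return history


def central_moment(image, p, q):
    """Central geometric moment mu_pq of a 2-D intensity image."""
    ys, xs = np.mgrid[:image.shape[0], :image.shape[1]]
    m00 = image.sum()
    if m00 == 0:
        return 0.0
    x_bar = (xs * image).sum() / m00
    y_bar = (ys * image).sum() / m00
    return float((((xs - x_bar) ** p) * ((ys - y_bar) ** q) * image).sum())
```

Under these assumptions, a feature vector for an action could be formed by evaluating `central_moment` at a few low orders on both images; the paper's actual feature set and its handling of anthropometric variation are not reproduced here.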
Changes in frequency such as those found in Risset tones have been associated with moving sound sources in the vertical plane (Pratt effect) and the horizontal plane (Doppler illusion). We investigated the reported origin and motion of unspatialized Risset tones presented monotically and diotically, and Risset tones simulated to be in the sagittal or coronal plane, approaching or receding, from above or horizontally. Independent of the artificial spatialization used (none, spatializing frequency components collectively or individually, elevated or not), upward glissandi were more likely to be judged as approaching than receding, and downward glissandi as receding than approaching, in most cases from the horizon. Glissandi associations with horizontal movements were more common in stimuli simulated on the sagittal plane than in stimuli simulated on the coronal plane. These findings suggest that the Doppler illusion is stronger than the Pratt effect, at least for Risset tones presented over headphones and simulated to be in the sagittal plane. They may also contribute to a better understanding of the association between auditory motion perception and changes in frequency.
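For readers unfamiliar with Risset tones, the following Python sketch shows one common way to synthesize an unspatialized Shepard-Risset glissando: octave-spaced sinusoids glide in log frequency while a raised-cosine envelope fades each component in at one edge of the range and out at the other. This is only an illustrative construction; the function name `risset_glissando` and all parameter values (sample rate, number of components, base frequency, cycle length) are assumptions and are not the stimuli used in the study.

```python
import numpy as np


def risset_glissando(duration=10.0, sr=16000, n_components=8,
                     f_min=20.0, cycle=10.0, direction=1):
    """Synthesize a Shepard-Risset glissando (illustrative parameters only).

    direction=+1 gives an upward glide, direction=-1 a downward glide.
    Each component moves one octave every `cycle` seconds.
    """
    t = np.arange(int(duration * sr)) / sr
    signal = np.zeros_like(t)
    for k in range(n_components):
        # Position of this component in octaves above f_min, wrapped so it
        # re-enters at the opposite edge of the range.
        pos = (k + direction * t / cycle) % n_components
        freq = f_min * 2.0 ** pos
        # Integrate instantaneous frequency to obtain the phase.
        phase = 2.0 * np.pi * np.cumsum(freq) / sr
        # Raised-cosine amplitude: zero at both edges, so the wrap is silent.
        amp = 0.5 * (1.0 - np.cos(2.0 * np.pi * pos / n_components))
        signal += amp * np.sin(phase)
    return signal / np.max(np.abs(signal))
```

With the defaults, the highest component stays below the Nyquist frequency (20 Hz × 2^8 = 5120 Hz against 8000 Hz); changing `sr`, `f_min` or `n_components` requires rechecking that bound.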
In two artificial language learning experiments with four groups of highly proficient Basque-Spanish bilinguals and two groups of Spanish monolinguals, we examine the cues that allow adult listeners to parse new input into phrases. In addition, we investigate which factors lead bilinguals to switch between the segmentation strategies characteristic of their two languages. We show that segmental information drives bilinguals' choice of a segmentation strategy when presented with an unfamiliar language. The language in which bilinguals are addressed during the study (i.e., the language of context) additionally modulates their segmentation preference; this context language effect is found in L1 Basque bilinguals but does not extend to L1 Spanish bilinguals. The cause of this asymmetry is yet to be established. Finally, we show that adult monolinguals disregard statistical cues in favor of unfamiliar segmental information when the two are in conflict. These results provide evidence that the available phrase segmentation cues are arranged hierarchically.