Abstract: In this paper, we address a beamforming application based on the capture of far-field speech data from a single speaker in a real meeting room. After the position of the speaker is estimated by a speaker tracking system, we construct a subband-domain beamformer in generalized sidelobe canceller (GSC) configuration. In contrast to conventional practice, we then optimize the active weight vectors of the GSC so as to obtain an output signal with maximum negentropy (MN). This implies the beamformer output should be as non-Gaussian as possible. For calculating negentropy, we consider the Γ and the generalized Gaussian (GG) pdfs. After MN beamforming, Zelinski postfiltering is performed to further enhance the speech by removing residual noise. Our beamforming algorithm can suppress noise and reverberation without the signal cancellation problems encountered in conventional beamforming algorithms. We demonstrate this fact through a set of acoustic simulations. Moreover, we show the effectiveness of our proposed technique through a series of far-field automatic speech recognition (ASR) experiments on the Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV), a corpus of data captured with real far-field sensors, in a realistic acoustic environment, and spoken by real speakers. On the MC-WSJ-AV evaluation data, the delay-and-sum beamformer with post-filtering achieved a word error rate (WER) of 16.5%. MN beamforming with the Γ pdf achieved a 15.8% WER, which was further reduced to 13.2% with the GG pdf, whereas the simple delay-and-sum beamformer provided a WER of 17.8%. To the best of our knowledge, no lower error rates have been reported to date in the literature on this ASR task.
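The negentropy criterion named in the abstract above can be illustrated with a small sketch. Negentropy is the entropy of a Gaussian with the same variance minus the entropy of the signal itself, so it is zero for Gaussian data and grows as the distribution becomes more super-Gaussian. The code below is only a simplified illustration, not the authors' implementation: it assumes a real-valued, zero-mean GG pdf with scale `alpha` and shape `beta` (both hypothetical parameter names) and uses closed-form entropies, whereas the paper works with subband samples and optimizes beamformer weights.

```python
import math


def gg_entropy(alpha: float, beta: float) -> float:
    """Differential entropy (nats) of a real, zero-mean generalized Gaussian
    p(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x| / alpha) ** beta)."""
    return 1.0 / beta + math.log(2.0 * alpha / beta) + math.lgamma(1.0 / beta)


def gg_variance(alpha: float, beta: float) -> float:
    """Variance of the same pdf: alpha^2 * Gamma(3/beta) / Gamma(1/beta)."""
    return alpha ** 2 * math.exp(math.lgamma(3.0 / beta) - math.lgamma(1.0 / beta))


def gaussian_entropy(variance: float) -> float:
    """Differential entropy (nats) of a real Gaussian with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def gg_negentropy(alpha: float, beta: float) -> float:
    """Negentropy: entropy of the variance-matched Gaussian minus the GG entropy."""
    return gaussian_entropy(gg_variance(alpha, beta)) - gg_entropy(alpha, beta)


if __name__ == "__main__":
    # beta = 2 is the Gaussian case; smaller beta means heavier tails.
    for beta in (2.0, 1.0, 0.5):
        print(f"beta={beta}: negentropy={gg_negentropy(1.0, beta):.4f}")
```

In this simplified setting the negentropy does not depend on the scale `alpha`, only on the shape `beta`, which matches the intuition that the criterion rewards non-Gaussian shape rather than signal power.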
This paper is presented as a performance between the two authors, who discuss the notion of daydreaming as a transitional space between their research interests in dreams and the semantic associations of conscious thought. The first half concerns the logical, rational waking mind as applied to an understanding of daydreaming as a bridge between one state and another. It investigates the idea of the interactive interface as a parallel to the daydream, where both enable a middle ground, or safe space, for crossing over. The difficulty highlighted through the performative reading of the paper is to keep focus without slipping into the interstitial state of a daydream. The paper gently drifts towards the second half, which explores the understanding of dream states and the neuroscientific models of the unconscious brain.
This research project maps virtual emotions. Rauch uses 3D-surface capturing devices to scan facial expressions in (stuffed) animals and humans, which she then sculpts with the Phantom Arm/SensAble FreeForm device in 3D virtual space. The results are rapidform printed objects and 3D animations of morphing faces and gestures. Building on her research into consciousness studies and emotions, she has developed a new artwork to reveal characteristic aspects of human emotions (e.g. laughing, crying, frowning, sneering), which utilises new technology, in particular digital scanning devices and special effects animation software. The proposal is to use a 3D high-resolution laser scanner to capture animal faces and, using the data of these faces, animate and then combine them with human emotional facial expressions. The morphing of the human and animal facial data is not merely a layering of the different scans: by applying an algorithmic programme to the data, crucial landmarks in the animal face are merged so as to match those of the human. The results are morphings of the physical characteristics of animals with the emotional characteristics of the human face in 3D. The focus of this interdisciplinary research project is a collaborative practice that brings together researchers from UCL in London and researchers at OCAD University's data and information visualization lab. Rauch uses Darwin's metatheory of the continuity of species and other theories on evolution and internal physiology (Ekman et al.) in order to reexamine previous and new theories with the use of new technologies, including the SensAble FreeForm Device, which, as an interface, allows for haptic feedback from digital data.