Due to hard-and software progress applications based on sound enhancement are gaining popularity. But such applications are often still limited by hardware costs, energy and real-time constraints, thereby bounding the available complexity. One task often accompanied with (multichannel) sound enhancement is the localization of the sound source. This paper focusses on implementing an accurate Sound Source Localizer (SSL) for estimating the position of a sound source on a digital signal processor, using as less CPU resources as possible. One of the least complex algorithms for SSL is a simple correlation, implemented in the frequency-domain for efficiency, combined with a frequency bin weighing for robustness. Together called Generalized Cross Correlation (GCC). One popular weighing called GCC PHAse Transform (GCC-PHAT) will be handled. In this paper it is explained that for small microphone arrays this frequency-domain implementation is inferior to its time-domain alternative in terms of algorithmic complexity. Therefore a time-domain PHAT equivalent will be described. Both implementations are compared in terms of complexity (clock cycles needed on a Texas Instruments C5515 DSP) and obtained results, showing a complexity gain with a factor of 146, with hardly any loss in localization accuracy.
This work examines the use of a Wireless Acoustic Sensor Network (WASN) for the classification of clinically relevant activities of daily living (ADL) of elderly people. The aim of this research is to automatically compile a summary report about the performed ADLs which can be easily interpreted by caregivers. In this work, the classification performance of the WASN will be evaluated in both clean and noisy conditions. Results indicate that the classification performance of the WASN is 75.3±4.3% on clean acoustic data selected from the node receiving with the highest SNR. By incorporating spatial information extracted by the WASN, the classification accuracy further increases to 78.6±1.4%. In addition, the classification performance of the WASN in noisy conditions is in absolute average 8.1% to 9.0% more accurate compared to highest obtained single microphone results.
This paper gives an overview of research within the ALADIN project, which aims to develop an assistive vocal interface for people with a physical impairment. In contrast to existing approaches, the vocal interface is trained by the end-user himself, which means it can be used with any vocabulary and grammar, and that it is maximally adapted to the -possibly dysarthric -speech of the user. This paper describes the overall learning framework, the user-centred design and evaluation aspects, database collection and approaches taken to combat problems such as noise and erroneous input.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.