In this paper, we put forward a computational framework for comparing motor, auditory and perceptuo-motor theories of speech communication. We first recall the basic arguments of these three sets of theories, as applied to either speech perception or speech production. We then present a unifying Bayesian model able to express each theory in a probabilistic way. Focusing on speech perception, we demonstrate that under two hypotheses, regarding communication noise and inter-speaker variability, which together provide perfect conditions for speech communication, motor and auditory theories are indistinguishable. We then relax each hypothesis in turn to study the distinguishability of the different theories in "adverse" conditions. We first present simulations on a simplified implementation of the model with one-dimensional sensory and motor variables, and then consider a simulation of the human vocal tract providing more realistic auditory and articulatory variables. Simulation results allow us to emphasize the respective roles of motor and auditory knowledge in various adverse conditions of speech perception, and to suggest some guidelines for future studies aiming at assessing the role of motor knowledge in speech perception.
There is a consensus concerning the view that both auditory and motor representations intervene in the perceptual processing of speech units. However, the question of the functional role of each of these systems remains seldom addressed and poorly understood. We capitalized on the formal framework of Bayesian Programming to develop COSMO (Communicating Objects using Sensory-Motor Operations), an integrative model that allows principled comparisons of purely motor or purely auditory implementations of a speech perception task and tests the gain of efficiency provided by their Bayesian fusion. Here, we show 3 main results: (a) In a set of precisely defined “perfect conditions,” auditory and motor theories of speech perception are indistinguishable; (b) When a learning process that mimics speech development is introduced into COSMO, it departs from these perfect conditions. Then auditory recognition becomes more efficient than motor recognition in dealing with learned stimuli, while motor recognition is more efficient in adverse conditions. We interpret this result as a general “auditory-narrowband versus motor-wideband” property; and (c) Simulations of plosive-vowel syllable recognition reveal possible cues from motor recognition for the invariant specification of the place of plosive articulation in context that are lacking in the auditory pathway. This provides COSMO with a second property, where auditory cues would be more efficient for vowel decoding and motor cues for plosive articulation decoding. These simulations provide several predictions, which are in good agreement with experimental data and suggest that there is natural complementarity between auditory and motor processing within a perceptuo-motor theory of speech perception.
As the physical limits of Moore's law are being reached, a research effort has been launched to achieve further performance improvements by exploring computation paradigms that depart from standard approaches. The BAMBI project (Bottom-up Approaches to Machines dedicated to Bayesian Inference) aims at developing hardware dedicated to probabilistic computation, which extends the logic computation realised by Boolean gates in current computer chips. Such probabilistic computing devices would allow a wide range of Artificial Intelligence applications to be solved faster and at a lower energy cost, especially when decisions need to be taken from incomplete data in an uncertain environment. This paper describes an architecture where very simple operators compute on a time coding of probability values as stochastic signals. Simulation tests and a reconfigurable logic hardware implementation demonstrate the feasibility and performance of the proposed inference machine. Hardware results show that this architecture can quickly solve Bayesian sensor fusion problems and is very efficient in terms of energy consumption.
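To make the idea of time-coded probabilities concrete, here is a minimal software sketch; it is not the hardware design described above. It assumes that each probability is encoded as a random bit stream whose rate of ones equals the probability, that independent streams are combined with a bitwise AND (so the output rate estimates the product), and that the prior over states is uniform. The function names (`bitstream`, `and_streams`, `fuse`), the stream length and the seed are illustrative choices.

```python
import random

def bitstream(p, n, rng):
    """Encode probability p as n independent bits, each equal to 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def and_streams(streams):
    """Bitwise AND across streams; for independent streams the rate of ones
    in the result estimates the product of the encoded probabilities."""
    return [int(all(bits)) for bits in zip(*streams)]

def fuse(likelihoods, n=20000, seed=0):
    """Estimate P(X = x | readings) under a uniform prior (an assumption).

    likelihoods[k][x] is P(reading of sensor k | X = x).
    Returns the normalized counts of ones, one entry per state x.
    """
    rng = random.Random(seed)
    n_states = len(likelihoods[0])
    counts = []
    for x in range(n_states):
        streams = [bitstream(lk[x], n, rng) for lk in likelihoods]
        counts.append(sum(and_streams(streams)))
    total = sum(counts)
    return [c / total for c in counts]

if __name__ == "__main__":
    # Two sensors, three candidate states; the exact posterior is (0.9, 0.075, 0.025).
    print(fuse([[0.9, 0.2, 0.1], [0.8, 0.3, 0.2]]))
```

The estimate converges to the exact posterior as the streams get longer, which is the trade-off between precision and computation time inherent to this kind of encoding.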
Compared to conventional processors, stochastic computing architectures have strong potential to speed up computation and to reduce power consumption. We present such an architecture, called Bayesian Machine (BM), dedicated to solving Bayesian inference problems. Given a set of noisy signals provided by low-level sensors, a BM estimates the posterior probability distribution of an unknown target variable. In the present study, a BM is used to solve a sound source localization (SSL) problem: the BM computes the probability distribution of the position of a sound source given acoustic signals captured by a set of microphones. Assuming free-field wave propagation (no reverberation), we express the SSL problem as the maximization of a likelihood function fed with audio features provided by the time-frequency (TF) analysis of the captured audio waves. The proposed BM uses bitwise parallel sampling to fuse the resulting multi-channel information. As the number of channels to fuse is large, the standard BM architecture encounters the so-called "time dilution problem" (long delays are necessary to obtain valid samples). We tackle this problem by using max-normalization of the distributions combined with a periodic re-sampling of the bit streams after processing a reasonably small subset of the evidence. Finally, we compare the localization performance of the proposed machine with the results obtained using a standard version of the machine. The re-sampling leads to an acceleration factor of 10^3 in the computation.
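The effect of max-normalization on time dilution can be illustrated with a small deterministic sketch. It assumes that evidence is fused by an AND of independent bit streams, so the probability of obtaining a 1 for a given state is the product of the likelihoods for that state. The periodic re-sampling step is not modeled here, and the helper names (`max_normalize`, `expected_ones_rate`, `posterior`) and the numeric values are hypothetical.

```python
from math import prod

def max_normalize(dist):
    """Rescale a distribution so that its largest value becomes 1."""
    m = max(dist)
    return [v / m for v in dist]

def expected_ones_rate(likelihoods):
    """Per state, the probability that the AND of all streams outputs a 1,
    i.e. the product of the likelihoods for that state."""
    n_states = len(likelihoods[0])
    return [prod(lk[x] for lk in likelihoods) for x in range(n_states)]

def posterior(rates):
    """Normalize the per-state rates into a probability distribution."""
    total = sum(rates)
    return [r / total for r in rates]

if __name__ == "__main__":
    # Four channels with small likelihood values (illustrative numbers).
    raw = [[0.05, 0.02, 0.01]] * 4
    scaled = [max_normalize(lk) for lk in raw]
    print(expected_ones_rate(raw))     # very low rates: valid samples are rare
    print(expected_ones_rate(scaled))  # the best state now fires at rate 1
    print(posterior(expected_ones_rate(raw)))
    print(posterior(expected_ones_rate(scaled)))
```

Because each channel is rescaled by a constant, the normalized posterior is unchanged, while the rate of ones for the most likely state rises from about 6e-6 to 1 in this example, so far fewer clock cycles are needed to collect useful samples.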
In recent years, stochastic computing has become popular for implementing Bayesian circuits because it enables compact and low-power architectures. These architectures use Stochastic Number Generators (SNGs) that encode data in random bit-streams. SNGs are built from Random Number Generators (RNGs), which contribute significantly to the circuit area and power consumption: according to our measurements, up to 29% of the area and 85% of the circuit power consumption, excluding memories. In this paper, we compare SNG implementations in terms of accuracy, area and power consumption. Furthermore, we propose a new SNG architecture that uses a single RNG for the whole design to generate the required stochastic bit-streams. The proposed architecture saves up to 11% of area and 58% of power consumption compared to the state of the art, with no significant accuracy loss.
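As a rough illustration of what an SNG does, the sketch below models a common comparator-based design in software: a pseudo-random number is compared with a threshold proportional to the target probability. It is not the architecture proposed in the paper, whose single-RNG technique is not described here. The 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) and the names `lfsr16`, `sng` and `shared_sng` are assumptions made for the example.

```python
def lfsr16(state=0xACE1):
    """16-bit Fibonacci LFSR with taps 16, 14, 13, 11; yields successive states."""
    while True:
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield state

def sng(p, rng, n):
    """Comparator-based SNG: emit 1 whenever the random number is below the threshold."""
    threshold = round(p * 65536)
    return [1 if next(rng) < threshold else 0 for _ in range(n)]

def shared_sng(ps, rng, n):
    """Naive sharing: every stream compares the SAME random number with its own threshold."""
    thresholds = [round(p * 65536) for p in ps]
    streams = [[] for _ in ps]
    for _ in range(n):
        r = next(rng)
        for stream, t in zip(streams, thresholds):
            stream.append(1 if r < t else 0)
    return streams

if __name__ == "__main__":
    n = 65535  # one full LFSR period
    a = sng(0.25, lfsr16(), n)
    print(sum(a) / n)  # close to 0.25
    s1, s2 = shared_sng([0.25, 0.5], lfsr16(), n)
    # Fully correlated streams: the AND yields min(0.25, 0.5), not the product 0.125.
    print(sum(x & y for x, y in zip(s1, s2)) / n)
```

The second printout shows why reusing one RNG is not trivial: streams derived from identical random numbers are fully correlated, so an AND gate no longer multiplies their probabilities. Any single-RNG design therefore has to decorrelate the streams in some way.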
This work revisits the stochastic computing paradigm as a way to implement architectures dedicated to probabilistic inference. In general, it is assumed that operating on stochastic bit streams is robust to the effects of radiation-induced transient events. Moreover, it can be expected that leveraging the stochastic computing paradigm to implement probabilistic computations in hardware, such as Bayesian inference, could yield increased resilience to radiation effects compared with deterministic procedures. However, a practical assessment of the robustness against radiation is mandatory before considering Stochastic Bayesian Machines (SBMs) for hazardous environments. Results of fault injection campaigns at the RTL level provide the first evidence of the intrinsic robustness of SBMs with respect to single-event upsets (SEUs) and single-event transients (SETs).