Abstract: Independent Vector Analysis (IVA) is a powerful tool for estimating, in the frequency domain, the broadband acoustic transfer functions between multiple sources and the microphones. In this work, we consider an extended IVA model that adopts the concept of pilot-dependent signals. Without imposing any constraint on the de-mixing system, pilot signals that depend on the target source are injected into the model, enforcing a permutation of the outputs that is consistent over time. A neural network trained on acoustic data and a lip-motion detector are used jointly to produce a multimodal pilot signal dependent on the target source. Experimental results show that this structure allows a predefined target source to be enhanced in very difficult and ambiguous scenarios.