Truly universal helper robots that can cope with unknown, unstructured environments must be capable of spatial reasoning, i.e., establishing geometric relations between objects and locations and expressing them in terms humans can understand. It is therefore desirable that spatial and semantic environment representations are tightly interlinked. 3D robotic mapping and the generation of consistent metric representations of space are highly useful for navigation and exploration, but they do not capture symbol-level information about the environment. Such information is, however, essential for reasoning, and it enables interaction via natural language, which is arguably the most common and natural communication channel used and understood by humans. This article presents a review of research in three major fields relevant to this discussion of spatial reasoning and interaction. First, dialogue systems are an integral part of modern approaches to situated human-robot interaction. Second, interactive robots must be equipped with environment representations and reasoning methods that are suitable for navigation and task fulfillment as well as for interaction with human partners. Third, at the interface between these domains are systems that ground language in systemic environment representations and that allow information from natural language descriptions to be integrated into robotic maps. For each of these areas, important approaches are outlined, relations between the fields are highlighted, and challenging applications as well as open problems are discussed.
Segmentation of speech signals is a crucial task in many types of speech analysis. We present a novel approach to segmentation at the syllable level, using a Bidirectional Long Short-Term Memory neural network. It estimates syllable nucleus positions by regression of perceptually motivated input features to a smooth target function. Peak selection is then performed to obtain valid nucleus positions. Performance of the model is evaluated at the level of both syllables and the vowel segments that make up the syllable nuclei. The general applicability of the approach is illustrated by good results on two common databases, Switchboard and TIMIT, covering both read and spontaneous speech, and by a favourable comparison with other published results.
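The peak-selection step described above can be illustrated with a minimal sketch. The abstract does not specify the shape of the smooth target function or the peak-picking rule, so everything here is an assumption for illustration: the target is built from Gaussian bumps centred on the nucleus frames, and peaks are chosen with a hypothetical threshold and minimum frame distance. The network itself is not reproduced; the synthetic curve stands in for its regression output.

```python
import math


def smooth_target(nuclei_frames, n_frames, sigma=3.0):
    """Assumed target shape: the maximum over Gaussian bumps centred on each nucleus frame."""
    target = [0.0] * n_frames
    for t in range(n_frames):
        for c in nuclei_frames:
            bump = math.exp(-0.5 * ((t - c) / sigma) ** 2)
            target[t] = max(target[t], bump)
    return target


def select_peaks(curve, threshold=0.5, min_distance=5):
    """Keep local maxima above `threshold` that are at least `min_distance` frames apart.

    Both parameters are hypothetical; the abstract does not state the rule used.
    """
    candidates = [
        t
        for t in range(1, len(curve) - 1)
        if curve[t] >= threshold
        and curve[t] > curve[t - 1]
        and curve[t] >= curve[t + 1]
    ]
    # Greedy selection: strongest peaks first, discard any that crowd a kept peak.
    candidates.sort(key=lambda t: curve[t], reverse=True)
    kept = []
    for t in candidates:
        if all(abs(t - k) >= min_distance for k in kept):
            kept.append(t)
    return sorted(kept)


# Synthetic stand-in for the network output: three nuclei in a 70-frame utterance.
nuclei = [10, 30, 55]
curve = smooth_target(nuclei, 70)
recovered = select_peaks(curve)
print(recovered)
```

On a real network output the curve would be noisier, which is where the threshold and minimum distance matter: they suppress weak or closely spaced maxima that would otherwise be counted as separate nuclei.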