In this paper, we present a complete platform for the semiautomatic and simultaneous generation of human-machine dialog applications in two separate modalities (voice and Web) and several languages, providing data-centered services oriented to retrieving or modifying information in a database. Since one of the main objectives of the platform is to unify the application design process regardless of modality or language, and only then complete it with the specific details of each, the design process begins with a general description of the application: the data model, the database access functions, and a generic finite state diagram describing the application flow. With this information, the actions to be carried out in each dialog state are defined. The specific characteristics of each modality and language (grammars, prompts, presentation aspects, user levels, etc.) are then specified in later assistants. Finally, the scripts that execute the application in the real-time system are generated automatically. We describe each assistant in detail, emphasizing the methodologies followed to ease the design process, especially in its critical aspects. We also describe the strategies and features we have applied to provide portability, robustness, adaptability, and high performance, and we address important issues in dialog applications such as mixed initiative, over-answering, confirmation handling, and presenting long lists of information to the user. Finally, we report the results of a subjective evaluation with different designers and of the creation of two full applications; these results confirm the usability, flexibility, and standardization of the platform and suggest new research directions.
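The generic finite state diagram at the core of the design process described above can be illustrated with a minimal sketch: states with associated actions, and transitions driven by user input. All names here (`DialogFlow`, `add_state`, etc.) are hypothetical and do not correspond to the platform's actual API.

```python
# Hypothetical sketch of a generic finite-state dialog flow:
# each state has an action, and (state, input) pairs define transitions.
class DialogFlow:
    def __init__(self):
        self.states = {}       # state name -> action callable
        self.transitions = {}  # (state, user_input) -> next state
        self.current = None

    def add_state(self, name, action):
        self.states[name] = action
        if self.current is None:
            self.current = name  # first state added is the initial state

    def add_transition(self, src, user_input, dst):
        self.transitions[(src, user_input)] = dst

    def step(self, user_input):
        # Stay in the current state if no transition matches the input.
        self.current = self.transitions.get(
            (self.current, user_input), self.current)
        return self.states[self.current]()  # execute the state's action

flow = DialogFlow()
flow.add_state("welcome", lambda: "Welcome. Say 'balance' to query your account.")
flow.add_state("balance", lambda: "Your balance is being retrieved.")
flow.add_transition("welcome", "balance", "balance")
print(flow.step("balance"))  # -> "Your balance is being retrieved."
```

In a data-centered service, the state actions would typically invoke the database access functions defined in the first design step.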
We present a speech-controllable MP3 player for embedded systems. In addition to basic commands such as "next" or "repeat", one main feature of the system is the selection of titles, artists, albums, genres, or composers by speech. We describe the implemented dialog and discuss the challenges of a real-world application. The findings and considerations of this paper extend readily to general audio media.
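The two kinds of interaction mentioned above, basic playback commands and selection by name, can be sketched as a simple dispatcher over recognized utterances. This is an illustrative assumption, not the paper's implementation; the `Player` class and its methods are invented for the example.

```python
# Hypothetical dispatch of recognized speech commands for an MP3 player:
# basic commands ("next", "repeat") plus selection by title or artist.
class Player:
    def __init__(self, library):
        self.library = library  # list of (title, artist) pairs
        self.index = 0          # currently selected track

    def handle(self, utterance):
        if utterance == "next":
            self.index = (self.index + 1) % len(self.library)
        elif utterance == "repeat":
            pass  # keep the current index: replay the current title
        elif utterance.startswith("play "):
            # Select by title or artist, as recognized from speech.
            name = utterance[len("play "):]
            for i, (title, artist) in enumerate(self.library):
                if name in (title, artist):
                    self.index = i
                    break
        return self.library[self.index][0]  # title now playing

p = Player([("Track A", "Artist X"), ("Track B", "Artist Y")])
p.handle("next")           # -> "Track B"
p.handle("play Artist X")  # -> "Track A"
```

In a real embedded system the utterance would come from a speech recognizer, and name matching would have to tolerate recognition errors and near-homophones among titles.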
In this paper we present DiaGen, a tool that supports code generation for embedded dialogue applications. With its aid, the dialogue development process is sped up considerably, while it is guaranteed that only well-formed and well-defined constructs are used. Although the tool has its roots in the EU-funded project GEMINI, fundamental changes were necessary to adapt it to the requirements of the application environment. The paper also covers the basics of embedded speech dialogue systems.