With the development of ubiquitous computing, entering text on HMDs and smart TVs using handheld touchscreen devices (e.g., smartphones and controllers) is becoming increasingly attractive. In these indirect touch scenarios, the touch input surface is decoupled from the visual display. Compared with direct touch input, entering text on a keyboard via indirect touch is more challenging because no visual feedback is available for locating the finger before it touches the surface. To address this problem, we investigate the feasibility of gesture typing for indirect touch: since the finger stays in contact with the screen throughout typing, continuous visual feedback can be provided, which benefits input performance. We first examine users' gesture typing ability in terms of the appropriate keyboard size and location in motor space, and then compare typing performance in direct and indirect touch modes. We then propose an improved design that addresses the uncertainty and inaccuracy of the first touch. Our evaluation shows that users can quickly learn indirect gesture typing, reaching 22.3 words per minute after 30 phrases, which significantly outperforms previously reported results in the literature. Our work provides empirical support for leveraging gesture typing for indirect touch.
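For readers unfamiliar with how a word-gesture keyboard turns a swipe into a word, the sketch below illustrates one generic decoding approach: resample the touch trajectory and match it against the ideal path through each candidate word's key centers (shape matching in the spirit of SHARK2). This is a minimal, hypothetical illustration, not the decoder used in the paper; the key coordinates, vocabulary, and all function names are assumptions.

    # Illustrative shape-matching decoder sketch (NOT the paper's decoder).
    # Key centers below approximate a QWERTY layout and are hypothetical.
    import numpy as np

    KEY_POS = {
        'e': (2.0, 0.0), 'r': (3.0, 0.0), 't': (4.0, 0.0), 'o': (8.0, 0.0),
        'a': (0.5, 1.0), 'h': (5.5, 1.0), 'c': (3.5, 2.0),
    }

    def resample(points, n=32):
        """Resample a polyline to n equidistant points."""
        pts = np.asarray(points, dtype=float)
        seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
        cum = np.concatenate([[0.0], np.cumsum(seg)])
        t = np.linspace(0.0, cum[-1], n)
        return np.stack([np.interp(t, cum, pts[:, 0]),
                         np.interp(t, cum, pts[:, 1])], axis=1)

    def word_template(word, n=32):
        """Ideal swipe path: straight segments through the word's key centers."""
        return resample([KEY_POS[ch] for ch in word], n)

    def decode(trace, vocabulary, n=32):
        """Return the word whose template is closest to the swipe trace."""
        t = resample(trace, n)
        return min(vocabulary,
                   key=lambda w: np.linalg.norm(t - word_template(w, n),
                                                axis=1).mean())

    # e.g. decode(recorded_touch_points, ["cat", "core", "hat"])

In an indirect touch setting, the same matching applies; the difference studied in the paper is that the trace is produced on a surface decoupled from the display, with continuous visual feedback rendered remotely.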
Speech input, such as voice assistants and voice messages, is an attractive interaction option for mobile users today. However, despite its popularity, smartphone speech input has a usage limitation: users must press a button or say a wake word to activate it, which is not always convenient. To address this, we match the motion of bringing the phone to the mouth with the user's intention to use voice input. In this paper, we present ProxiTalk, an interaction technique that allows users to enable smartphone speech input by simply moving the phone close to their mouths. We study how users use ProxiTalk and systematically investigate the recognition capabilities of various data sources (e.g., using the front camera to detect facial features, using two microphones to estimate the distance between phone and mouth). Results show that it is feasible to detect ProxiTalk use and classify gestures with the smartphone's built-in sensors. An evaluation study shows that users can quickly learn ProxiTalk and are willing to use it. In conclusion, our work provides empirical support that ProxiTalk is a practical and promising option for enabling smartphone speech input, one that can coexist with current trigger mechanisms.
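To make the two-microphone cue mentioned above concrete, one plausible signal is the level difference between the bottom (near-mouth) and top microphones: when the phone is held at the mouth, the bottom microphone captures noticeably more speech energy. The sketch below is a minimal, hypothetical illustration of that idea, not ProxiTalk's actual classifier; the threshold and all names are assumptions.

    # Illustrative dual-microphone proximity cue (NOT ProxiTalk's classifier).
    import numpy as np

    def rms(frame):
        """Root-mean-square energy of one audio frame."""
        return float(np.sqrt(np.mean(np.square(frame, dtype=float))))

    def near_mouth(bottom_mic, top_mic, ratio_db_threshold=6.0):
        """Guess whether the phone is held near the mouth.

        bottom_mic, top_mic: synchronized 1-D sample arrays from the two
        built-in microphones. Returns True when the bottom microphone is
        louder than the top one by more than the (hypothetical) threshold.
        """
        eps = 1e-12  # avoid log of zero on silent frames
        ratio_db = 20.0 * np.log10((rms(bottom_mic) + eps) /
                                   (rms(top_mic) + eps))
        return ratio_db > ratio_db_threshold

In practice such an acoustic cue would be fused with other sources (e.g., front-camera facial features and inertial motion), which is the multi-source recognition question the paper investigates.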