A novel interaction method for mobile phones using their built-in cameras is presented. By estimating the path connecting the center points of frames captured by the camera phone, objects of interest can be easily extracted and recognized. To estimate the movement of the mobile phone, corners and their corresponding Speeded-Up Robust Features (SURF) descriptors are used to calculate the spatial transformation parameters between the previous and current frames. These parameters are then used to map the locations of the center points in the previous frame into the current frame. Experimental results on real image sequences show that the proposed system is efficient, flexible, and able to provide accurate and stable results.

Keywords: Interaction method, global motion estimation, mobile device, SURF, corner detection.

Manuscript received Jan. 31, 2012; revised May 23, 2012; accepted June 22, 2012. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2011-0006109).

Toan Dinh Nguyen (+82 62 530 3425, toan_mulmi@hotmail.com), JeongHwan Kim (disturb@naver.com), SooHyung Kim (shkim@jnu.ac.kr), HyungJeong Yang (hjyang@jnu.ac.kr), and GueeSang Lee (corresponding author, gslee@jnu.ac.kr)
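The core step described above, estimating a spatial transformation from matched corner points and then remapping the previous frame's center point into the current frame, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes an affine motion model and already-matched keypoint pairs (the coordinates below are synthetic), and solves for the transformation by linear least squares.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2x3 affine transform mapping src points to dst points.

    src, dst: (N, 2) arrays of matched keypoint coordinates, N >= 3.
    Each match contributes two linear equations in the six parameters
    (a, b, tx, c, d, ty):  x' = a*x + b*y + tx,  y' = c*x + d*y + ty.
    """
    n = src.shape[0]
    A = np.zeros((2 * n, 6))
    b = dst.reshape(-1)
    A[0::2, 0:2] = src   # rows for the x' equations
    A[0::2, 2] = 1.0
    A[1::2, 3:5] = src   # rows for the y' equations
    A[1::2, 5] = 1.0
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params.reshape(2, 3)  # [[a, b, tx], [c, d, ty]]

def transform_points(M, pts):
    """Apply a 2x3 affine matrix M to (N, 2) points."""
    return pts @ M[:, :2].T + M[:, 2]

# Hypothetical matched corners: current frame is the previous frame
# translated by (+5, -3) pixels.
prev_pts = np.array([[10.0, 10.0], [100.0, 12.0], [50.0, 90.0], [120.0, 80.0]])
curr_pts = prev_pts + np.array([5.0, -3.0])

M = estimate_affine(prev_pts, curr_pts)
center_prev = np.array([[64.0, 48.0]])      # center of the previous frame
center_in_curr = transform_points(M, center_prev)  # -> [[69., 45.]]
```

Accumulating `center_in_curr` over successive frames yields the path of center points that the method uses to delimit the object of interest; in practice, a robust estimator (e.g. RANSAC) would replace the plain least-squares fit to reject mismatched corners.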
I. Introduction

The mobile phone is now the most widely used communication device in the world. Given the diversity of applications adopted for the mobile phone, new interaction methods should be developed to cope with the limitations of an interface constrained by its small size. Sensors embedded in the device can capture physical manipulation, such as contact, pressure, tilt, and implicit biometric information. Tilt input can be used for navigating menus, maps, and 3-D scenes or for scrolling through documents and lists [1]-[3].

Another interaction method is based on the motion of the mobile phone itself. Estimation of device motion often makes use of motion sensors, such as accelerometers [3], [4]. However, in this case, an external sensor needs to be installed. Since computer vision is a more natural choice, we focus herein on using computer vision in interaction systems. Previous studies cover multimodal interaction [5], gesture recognition, face tracking, and body tracking [6]. However, most of these methods are built on powerful desktop computers.

In TinyMotion [7], [8], image sequences captured by built-in cameras are analyzed in real time, without requiring additional sensors. TinyMotion is based on both image differencing and block correlation for motion estimation (ME) and can efficiently detect horizontal, vertical, rotational, and tilt movements. However, this method detects only the moving directions (up, down, left, and right) and does not provide information about the location of the camera or the distance of the movement. Another approach to estimating camera motion is the global ME (GME) technique, which has been widely used in many applications, such as video coding and video stabilization.
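The correlation-based ME used by systems such as TinyMotion can be illustrated with a minimal exhaustive translation search. This is a simplified sketch, not TinyMotion's actual algorithm: it scores whole-frame candidate shifts by the sum of absolute differences (SAD) on synthetic frame data.

```python
import numpy as np

def estimate_shift(prev, curr, max_shift=4):
    """Exhaustive-search ME: return the (dy, dx) translation that minimizes
    the mean absolute difference over the overlapping region of two frames."""
    best, best_sad = (0, 0), np.inf
    h, w = prev.shape
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Crop both frames to the region where they overlap under (dy, dx).
            p = prev[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
            c = curr[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
            sad = np.abs(p.astype(int) - c.astype(int)).mean()
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best

# Synthetic test: a random frame and a copy shifted down 2 px, right 1 px.
rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (32, 32), dtype=np.uint8)
curr = np.roll(prev, (2, 1), axis=(0, 1))
shift = estimate_shift(prev, curr)  # -> (2, 1)
```

The search cost grows quadratically with `max_shift`, which is why block-based correlation over small windows, as in TinyMotion, is preferred on mobile hardware; it also explains why such methods recover only a dominant direction of movement rather than a full spatial transformation.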