In this paper, we applied the concept of diminished reality to remove a content-irrelevant pedestrian (i.e., a real object) in the context of handheld augmented reality (AR). We prepared three view conditions: in the Transparent (TP) condition, we removed the pedestrian entirely; in the Semi-transparent (STP) condition, the pedestrian became semi-transparent; and in the Default (DF) condition, the pedestrian appeared as is. We conducted a user study to compare the effects of the three conditions on users' engagement with, and perception of, a virtual pet in the AR content. Our findings revealed that users felt less distracted from the AR content in the TP and STP conditions than in the DF condition. Furthermore, in the TP condition, users perceived the virtual pet as more lifelike, found its behavior more plausible, and felt a higher spatial presence in the real environment.

CCS CONCEPTS • Human-centered computing → User studies; Mixed / augmented reality.
People are interested in traveling through infinite virtual environments, but no standard navigation method yet exists in Virtual Reality (VR). The Walking-In-Place (WIP) technique is a navigation method that simulates locomotion, enabling immersive travel with less simulator sickness in VR. However, attaching sensors to the body is troublesome. A previously introduced method that performed WIP using an Inertial Measurement Unit (IMU) helped address this problem: it requires no additional sensors on the body, and its evaluation demonstrated acceptable WIP performance. However, that method has limitations, including a high rate of falsely recognized steps when the user performs various body motions within the tracking area. Previous works also did not evaluate WIP step-recognition accuracy. In this paper, we propose a novel WIP method using the position and orientation tracking provided by most PC-based VR HMDs. Our method likewise requires no additional sensors on the body and is more stable than the IMU-based method for non-WIP motions. We evaluated our method with nine subjects and found that WIP step accuracy was 99.32% regardless of head tilt, and the error rate was 0% for the squat motion, which is particularly prone to misrecognition. We distinguish jog-in-place as "intentional motion" and other motions as "unintentional motion"; this shows that our method correctly recognizes only jog-in-place. We also apply a saw-tooth virtual-velocity function to our method in a mathematically principled way; natural navigation becomes possible when this virtual-velocity approach is applied to WIP. Our method is useful for various applications that require jogging.
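The saw-tooth virtual-velocity idea described above can be sketched roughly as follows; the function name, the parameter values, and the linear decay shape are illustrative assumptions, not the paper's actual formulation.

```python
def sawtooth_velocity(t_since_step, step_period=0.5, v_max=1.4):
    """Forward speed (m/s) as a saw-tooth of the time since the last
    recognized jog-in-place step: speed peaks at each step and decays
    linearly, so virtual motion stops soon after the user stops jogging."""
    phase = (t_since_step % step_period) / step_period  # 0..1 within a cycle
    return v_max * (1.0 - phase)
```

Each recognized step would reset `t_since_step` to zero, producing the repeating saw-tooth velocity profile over a sequence of steps.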
Virtual humans (VHs) in augmented reality (AR) can provide users with an illusory sense of being together in real space. However, such an illusion can easily break when the augmented VH conflicts with (or is overlaid on) real objects. Recent spatial-understanding technology is beginning to enable physically plausible VH responses to collisions, but limitations remain (e.g., resolution, accuracy), and some conflict situations are inevitable (e.g., an unexpected passer-by), especially in daily life. Moreover, depending on the situation, a VH's plausible collision-avoidance behavior may itself interfere with the original interaction with the user. In this paper, we investigate three such situations: (1) when a VH enters a room through a closed door, (2) when the VH's body overlaps with static real objects, and (3) when a real moving object passes through the VH. We considered (2) an avoidable situation in which physically plausible VH behaviors might be required, whereas (1) and (3) were considered inevitable situations (e.g., the VH appears out of nowhere, or a passer-by cannot be aware of a virtual being) in which plausible VH behaviors may not be presentable, so alternatives are required. Thus, for each of these notable situations in AR, we tested different visual effects as presentation methods for physical conflicts between a VH and real objects. Our findings indicate that visual effects improve the VH's social presence, co-presence, and physicality depending on the situation and effect type, and also influence users' attention and social behaviors. We discuss the implications of our findings and future research directions.

INDEX TERMS Augmented reality, pervasive AR, virtual human, visual effects, perceptual issue, physicality conflict, social presence, co-presence, inevitable collision, human perception.
Pan-tilt-zoom (PTZ) and omnidirectional cameras serve as video-mediated communication interfaces for telemedicine. Most systems use either PTZ or omnidirectional cameras exclusively; even when the two are used together, their images are shown separately on 2D displays. Conventional foveated imaging techniques may offer a way to exploit the benefits of both cameras, i.e., the high resolution of the PTZ camera and the wide field of view of the omnidirectional camera, but displaying the unified image on a 2D display would reduce the benefit of "omni-"directionality. In this paper, we introduce a foveated imaging pipeline designed to support virtual reality head-mounted displays (HMDs). The pipeline consists of two parallel processes: one that estimates the parameters for integrating the two images, and another that renders the images in real time. We also propose a control mechanism for placing the foveal region (i.e., the high-resolution area) in the scene and for zooming. Our evaluations showed that the proposed pipeline achieved, on average, 17 frames per second when rendering the foveated view on an HMD, and improved angular resolution in the foveal region compared with the omnidirectional camera view alone. However, the improvement was less significant at zoom levels of 8× and higher. We discuss possible improvements and future research directions.
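The compositing step of such a pipeline can be sketched minimally as pasting the high-resolution PTZ patch into the omnidirectional frame with a radial alpha fade, assuming the patch has already been warped into the omnidirectional view's projection; the function name, parameters, and blend shape are illustrative, not the paper's actual pipeline.

```python
import numpy as np

def composite_foveated(omni, ptz_patch, center, inner_radius):
    """Blend a high-resolution PTZ patch into the omnidirectional frame at
    `center` (x, y). The patch is fully opaque inside `inner_radius` and
    fades out toward its border so the seam between sources is less visible."""
    h, w = ptz_patch.shape[:2]
    y0, x0 = center[1] - h // 2, center[0] - w // 2
    ys, xs = np.mgrid[0:h, 0:w]
    r = np.hypot(ys - h / 2, xs - w / 2)          # distance from patch center
    outer = min(h, w) / 2
    alpha = np.clip(1.0 - (r - inner_radius) / (outer - inner_radius), 0.0, 1.0)
    region = omni[y0:y0 + h, x0:x0 + w]
    blended = alpha[..., None] * ptz_patch + (1.0 - alpha[..., None]) * region
    omni[y0:y0 + h, x0:x0 + w] = blended.astype(omni.dtype)
    return omni
```

A real implementation would additionally handle patches that extend past the frame border and run this per eye on the GPU for HMD frame rates.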
Interactions with embodied conversational agents can be enhanced using human-like co-speech gestures. Traditionally, rule-based co-speech gesture mapping has been used for this purpose. However, creating such a mapping is laborious and often requires human experts, and human-created mappings tend to be limited and therefore prone to generating repetitive gestures. In this article, we present an approach that automates the generation of rule-based co-speech gesture mapping from a publicly available large video dataset without the intervention of human experts. At run time, word embeddings are used for rule searching to retrieve semantically aware, meaningful, and accurate rules. Our evaluation indicated that our method achieved performance comparable to a manual map created by human experts, while activating a wider variety of gestures. Moreover, synergy effects were observed in users' perception of the generated co-speech gestures when our map was combined with the manual map.
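The run-time rule search can be sketched as a nearest-neighbor lookup in embedding space; the function name, threshold, and data layout below are assumptions for illustration, not the article's actual implementation.

```python
import numpy as np

def find_gesture_rule(word_vec, rule_vecs, threshold=0.5):
    """Return the index of the gesture rule whose keyword embedding is most
    cosine-similar to the spoken word's embedding, or None if no rule is
    similar enough (so no gesture is triggered for that word)."""
    sims = rule_vecs @ word_vec / (
        np.linalg.norm(rule_vecs, axis=1) * np.linalg.norm(word_vec) + 1e-9)
    best = int(np.argmax(sims))
    return best if sims[best] >= threshold else None
```

Because the match is by embedding similarity rather than exact string equality, a word absent from the rule keywords (e.g. a synonym) can still activate a semantically related gesture rule.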