The goal of this research is to explore new interaction metaphors for augmented reality on mobile phones, i.e. applications where users look at the live image of the device's video camera and 3D virtual objects enrich the scene that they see. Common interaction concepts for such applications are often limited to pure 2D pointing and clicking on the device's touch screen. Such an interaction with virtual objects is not only restrictive but also difficult, for example, due to the small form factor. In this article, we investigate the potential of finger tracking for gesture-based interaction. We present two experiments evaluating canonical operations such as translation, rotation, and scaling of virtual objects with respect to performance (time and accuracy) and engagement (subjective user feedback). Our results indicate a high entertainment value, but low accuracy if objects are manipulated in midair, suggesting great possibilities for leisure applications but limited usage for serious tasks.
The Lifelog Search Challenge (LSC) is an international content retrieval competition that evaluates search over personal lifelog data. At the LSC, content-based search is performed over a multi-modal dataset, continuously recorded by a lifelogger over 27 days, consisting of multimedia content, biometric data, human activity data, and information activity data. In this work, we report on the first LSC, which took place in Yokohama, Japan in 2018 as a special workshop at the ACM International Conference on Multimedia Retrieval 2018 (ICMR 2018). We describe the general idea of this challenge, summarise the participating search systems as well as the evaluation procedure, and analyse the search performance of the teams in various aspects. We try to identify reasons why some systems performed better than others and provide an outlook as well as open issues for upcoming iterations of the challenge.
The Lifelog Search Challenge (LSC) is an annual benchmarking activity for comparing approaches to interactive retrieval from multi-modal lifelogs. LSC'20, the third such challenge, attracted fourteen participants with their interactive lifelog retrieval systems. These systems were comparatively evaluated in front of a live audience at the LSC workshop at ACM ICMR'20 in Dublin, Ireland. This overview motivates the challenge, presents the dataset and system configuration used in the challenge, and briefly introduces the participating teams.
We present recent work on improving the performance of automated speech recognizers by using additional visual information (lip-/speechreading), achieving error reductions of up to 50%. This paper focuses on different methods of combining the visual and acoustic data to improve recognition performance. We show this on an extension of an existing state-of-the-art speech recognition system, a modular MS-TDNN. We have developed adaptive combination methods at several levels of the recognition network. Additional information, such as the estimated signal-to-noise ratio (SNR), is used in some cases. The results of the different combination methods are shown for clean speech and for data with artificial noise (white, music, motor). The new combination methods adapt automatically to varying noise conditions, making hand-tuned parameters unnecessary.
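The core idea of SNR-adaptive stream combination can be illustrated with a minimal sketch. The abstract does not specify the exact weighting scheme used in the MS-TDNN, so the sigmoid mapping, its midpoint, and its slope below are illustrative assumptions, not the paper's method: the estimated SNR is mapped to an acoustic stream weight, and per-class scores from the acoustic and visual streams are linearly interpolated with that weight.

```python
import math

def snr_weight(snr_db, midpoint=10.0, slope=0.3):
    """Map an estimated SNR (in dB) to an acoustic stream weight in (0, 1).

    High SNR -> trust the acoustic stream; low SNR -> lean on the visual
    stream. The sigmoid shape, midpoint, and slope are illustrative choices.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint)))

def combine_scores(acoustic, visual, snr_db):
    """Linearly combine per-class scores from the two streams."""
    lam = snr_weight(snr_db)
    return [lam * a + (1.0 - lam) * v for a, v in zip(acoustic, visual)]

# In heavy noise (0 dB SNR) the visual scores dominate the combination.
acoustic = [0.2, 0.5, 0.3]
visual = [0.7, 0.2, 0.1]
print(combine_scores(acoustic, visual, snr_db=0.0))
```

Because the weight is a function of the continuously estimated SNR, the combination adapts automatically to changing noise conditions, which is what removes the need for hand-tuned interpolation parameters.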
Today, videos can be replayed on modern handheld devices, such as multimedia cellphones and personal digital assistants (PDAs), due to significant improvements in their processing power. However, screen size remains a limiting resource, making it hard, if not impossible, to adapt common approaches for video browsing to such mobile devices. In this paper we propose a new interface for the pen-based navigation of videos on PDAs and multimedia cellphones. Our solution, called the MobileZoomSlider, enables users to intuitively skim a video along the timeline at different granularity levels. In addition, it allows for continuous manipulation of replay speed for browsing purposes. Both interaction concepts are seamlessly integrated into the overall interface, thus taking optimum advantage of the limited screen space. Our claims are supported by a first evaluation, which confirms the suitability of the overall concept.
Interactive video retrieval tools developed over the past few years are emerging as powerful alternatives to automatic retrieval approaches by giving the user more control as well as more responsibility. Current research tries to identify the best combinations of image, audio, and text features that, combined with innovative UI design, maximize the tools' performance. We present the latest installment of the Video Browser Showdown, VBS 2015, which was held in conjunction with the International Conference on MultiMedia Modeling 2015 (MMM 2015) and has the stated aim of pushing for a better integration of the user into the search process. The setup of the competition, including the dataset used and the presented tasks, as well as the participating tools, will be introduced. The performance of those tools will be thoroughly presented and analyzed. Interesting highlights will be marked, and some predictions regarding the research focus within the field for the near future will be made.
Distributing recorded classroom lectures via podcasting for replay on mobile devices is gaining increasing popularity. However, few insights exist regarding the actual usage and usefulness of such files, especially in situations where high-quality recordings of those lectures are available for non-mobile replay as well. In this paper, we compare the results of two surveys: one done with local students who had access to podcasts as well as high-quality files for replay on laptops and desktop PCs, and one with external users who just subscribed to the podcasts. We compare the usage of the different versions, address the motivations of the two different user groups, and discuss general issues such as the perceived quality of the audio and video signals. Based on our observations, we conclude that the added value of such "e-lecture podcasts" lies mainly in their potential for mobile usage, whereas most of the other arguments given in favor of such an e-lecture delivery are rather due to the better visibility and "advertisement" of podcasts than justifiable by the technology involved.