Hand gesture recognition attracts great attention for interaction because it is intuitive and natural to perform. In this paper, we explore a novel interaction method based on bone-conducted sound generated by finger movements while performing gestures. We design a set of gestures that produce distinctive sound features and capture the resulting sound at the wrist with a commodity microphone. We then build a sound event detector and a recognition model to classify the gestures. Our system achieves an overall accuracy of 90.13% in quiet environments and 85.79% under noisy conditions. This promising technology can be deployed on existing smartwatches as a low-power service at no additional cost, and can be used for interaction in augmented and virtual reality applications.
Automatically describing video content in natural language is a long-standing challenge in computer vision. Although existing methods that capture relational information among objects have made significant strides in recent years, the detailed geometric and temporal information of objects remains underexplored. To address this problem, a novel Spatio-Temporal Aware Graph is proposed to capture more elaborate visual representations that exploit the detailed spatio-temporal cues in the extracted object features. By performing graph-structured aggregation, the proposed model captures not only the interactions among objects but also their detailed spatio-temporal relations. Meanwhile, a Frame Similarity Graph is constructed over frame features to learn comprehensive representations that supply the global information the object features lack. Moreover, to capture rich video semantics from different perspectives, multiple video representations, namely appearance and motion information, are utilised to learn discriminative representations. Experiments on the prevalent benchmarks, Microsoft Video Description Corpus and Microsoft Research Video to Text, demonstrate that the proposed approach achieves state-of-the-art performance on several widely used evaluation metrics: BLEU-4, METEOR, ROUGE, and CIDEr.