2020
DOI: 10.1007/978-3-030-45439-5_35

Motion Words: A Text-Like Representation of 3D Skeleton Sequences

Abstract: There is a growing amount of human motion data captured as continuous 3D skeleton sequences without any information about their semantic partitioning. To make such unsegmented and unlabeled data efficiently accessible, we propose to transform them into a text-like representation and employ well-known text retrieval models. Specifically, we partition each motion synthetically into a sequence of short segments and quantize the segments into motion words, i.e., compact features with characteristics similar to those of words…
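The following is a minimal Python sketch of the synthetic partitioning step the abstract describes: a continuous pose sequence is cut into short, fixed-length, overlapping segments. The segment length, stride, and pose dimensionality here are illustrative assumptions, not the paper's tuned values.

```python
# A sketch of synthetic partitioning: cut a continuous skeleton sequence
# into short overlapping segments. seg_len and stride are assumptions.
import numpy as np

def partition_sequence(frames: np.ndarray, seg_len: int = 16, stride: int = 8):
    """Cut a (num_frames, num_joints * 3) pose sequence into overlapping segments."""
    segments = []
    for start in range(0, len(frames) - seg_len + 1, stride):
        segments.append(frames[start:start + seg_len])
    return segments

# Example: a synthetic 100-frame sequence with 31 joints in 3D.
sequence = np.random.rand(100, 31 * 3)
print(len(partition_sequence(sequence)))  # 11 overlapping segments
```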

Cited by 14 publications (17 citation statements)
References 24 publications
“…However, there are also alternative approaches that cut the unsegmented data into a higher number of short overlapping segments, represent these by very compact features, and then compare sequences of such features. To obtain the compact features, the short segments are first represented by arbitrary high-dimensional features (e.g., raw skeleton data in [39] or deep features in [19]), then the space of segment features is clustered, and the cluster identifiers are used to form a vocabulary (codebook). Individual segments are then represented by the one-dimensional identifier of their closest cluster, so the similarity of two short segments is reduced to a trivial equality test over the quantized features.…”
Section: E. Feature Transformation
confidence: 99%
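As a rough illustration of the codebook construction this statement describes, the sketch below clusters synthetic segment features with scikit-learn's KMeans (a stand-in for whatever clustering the cited works actually use) and quantizes each segment to the identifier of its closest cluster; the vocabulary size of 256 is an assumption.

```python
# Sketch of codebook construction and quantization. Segment features are
# clustered, and each segment is replaced by the identifier of its closest
# cluster, so comparing two segments reduces to an integer equality test.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
segment_features = rng.random((2000, 96))   # e.g., flattened raw skeleton data

# Build the vocabulary (codebook) of 256 motion words.
codebook = KMeans(n_clusters=256, n_init=10, random_state=0).fit(segment_features)

# Quantization: each segment becomes a one-dimensional cluster identifier.
motion_words = codebook.predict(segment_features[:10])
print(motion_words)
```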
“…Individual segments are then represented by the one-dimensional identifier of their closest cluster, so the similarity of two short segments is reduced to a trivial equality test over the quantized features. The skeleton sequences can then be represented by sequences [39], histograms [19], [68], [69], or bags [40] of the quantized features. The bag-of-words representation proposed in [40] is mainly interesting from the large-scale processing perspective, since it enables the application of efficient and scalable text-retrieval techniques (e.g., inverted files) to a variety of motion processing tasks.…”
Section: E. Feature Transformation
confidence: 99%
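A minimal sketch of the three representations this statement mentions, built from a sequence of quantized motion-word identifiers; the vocabulary size and the word IDs themselves are made up for the example.

```python
# From a sequence of motion-word IDs, build the sequence, histogram,
# and bag-of-words representations mentioned above.
from collections import Counter

VOCAB_SIZE = 256
word_sequence = [17, 203, 17, 42, 42, 42, 91]   # sequence representation

histogram = [0] * VOCAB_SIZE                     # histogram representation
for w in word_sequence:
    histogram[w] += 1

bag = Counter(word_sequence)                     # bag-of-words: word -> count
print(bag)            # Counter({42: 3, 17: 2, 203: 1, 91: 1})
print(histogram[42])  # 3
```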
“…The efficiency-oriented works either propose very compact features that allow fast sequential scanning [12,13], or utilize various indexing schemes to organize the motion data (e.g., the binary tree [25], kd-tree [9], R*-tree [4], inverted file index [14], or tries [8]). To optimize the efficiency-effectiveness trade-off, a two-phase retrieval model is often used, where the candidate objects identified within an efficient search phase are submitted to a re-ranking phase that refines the result using more expensive techniques (e.g., traversal of a graph structure [9] or ranking by Dynamic Time Warping [14,20]). A more thorough discussion and comparison of all these methods can be found in the recent survey [21].…”
Section: Introduction
confidence: 99%
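The two-phase model described in this statement can be sketched as follows: a cheap first phase ranks database motions by shared motion words (loosely imitating an inverted-file lookup), and the surviving candidates are re-ranked by Dynamic Time Warping. Both scoring choices are simplifying assumptions for illustration, not any one paper's exact pipeline.

```python
# Two-phase retrieval sketch: cheap candidate filtering, then DTW re-ranking.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic O(len(a) * len(b)) DTW over per-frame Euclidean distances."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def retrieve(query_words, query_frames, database, k_candidates=10, k_final=3):
    """database: list of (motion_word_set, frames) pairs."""
    # Phase 1: rank by shared motion words (cheap set intersection).
    q_set = set(query_words)
    candidates = sorted(database, key=lambda e: -len(q_set & e[0]))[:k_candidates]
    # Phase 2: re-rank the survivors with the expensive DTW distance.
    return sorted(candidates, key=lambda e: dtw_distance(query_frames, e[1]))[:k_final]
```

Here k_candidates controls the efficiency-effectiveness trade-off the statement refers to: a larger candidate set costs more DTW computations in the second phase but is less likely to drop the true best match during the cheap first phase.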