“…Sequence-based methods treat the 3D-skeleton data as a multi-dimensional time series and model it with a recurrent architecture [21,22,32,35,46] to learn the temporal dynamics of the joints. Image-based methods create a pseudo-image representation of the 3D-skeleton data [7,12,17,23,38], which is encoded by CNN architectures to model the co-occurrence of multiple joints and their motion. Finally, graph-based methods [4,13,18,24,31,33,37,44] represent the 3D-skeleton data as a graph consisting of spatial and temporal edges.…”
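
The three representations named in the passage can be illustrated on a toy clip. The following is only a minimal sketch under stated assumptions: the clip size, the five-joint chain in `BONES`, and the helper names `as_sequence`, `as_pseudo_image`, and `as_graph` are hypothetical and do not come from any of the cited works, and the encoders (recurrent network, CNN, graph network) that would consume each view are omitted.

```python
import random

# Hypothetical toy clip: T frames, J joints, (x, y, z) per joint.
T, J = 4, 5
random.seed(0)
skeleton = [[[random.random() for _ in range(3)] for _ in range(J)]
            for _ in range(T)]

# Hypothetical bone list for a 5-joint chain; real datasets define their own.
BONES = [(0, 1), (1, 2), (2, 3), (3, 4)]


def as_sequence(clip):
    """Sequence view: one flat feature vector of length J*3 per frame."""
    return [[c for joint in frame for c in joint] for frame in clip]


def as_pseudo_image(clip):
    """Image view: image[c][t][j], with coordinates as channels,
    frames as rows, and joints as columns."""
    frames, joints = len(clip), len(clip[0])
    return [[[clip[t][j][c] for j in range(joints)] for t in range(frames)]
            for c in range(3)]


def as_graph(frames, joints, bones):
    """Graph view: undirected edges over (frame, joint) nodes."""
    # Spatial edges connect joints linked by a bone within the same frame.
    spatial = {((t, a), (t, b)) for t in range(frames) for a, b in bones}
    # Temporal edges connect the same joint across consecutive frames.
    temporal = {((t, j), (t + 1, j))
                for t in range(frames - 1) for j in range(joints)}
    return spatial, temporal


seq = as_sequence(skeleton)                            # T vectors of length J*3
img = as_pseudo_image(skeleton)                        # 3 x T x J nested lists
spatial_edges, temporal_edges = as_graph(T, J, BONES)
```

In this sketch all three views hold the same coordinates; they differ only in the structure exposed to the model: an ordered list of per-frame vectors, a grid indexed by channel, frame, and joint, or an explicit edge set.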