“…This method requires reliable detection of body joints and reliable tracking, which are difficult to achieve in computer vision today. Further, several other methods [6,35,19,17,34] were proposed that applied 3D reconstruction for multi-view action recognition. However, 3D reconstruction or 3D structure feature requires a reliable depth camera and strict coordinate of views, which are computationally expensive.…”