Abstract-In this paper, we consider the problem of tracking human motion with a 22-DOF kinematic model from depth images. In contrast to existing approaches, our system naturally scales to multiple sensors. The motivation behind our approach, termed Multiple Depth Camera Approach (MDCA), is that by using several cameras, we can significantly improve the tracking quality and reduce ambiguities as for example caused by occlusions. By fusing the depth images of all available cameras into one joint point cloud, we can seamlessly incorporate the available information from multiple sensors into the pose estimation. To track the high-dimensional human pose, we employ state-of-the-art annealed particle filtering and partition sampling. We compute the particle likelihood based on the truncated signed distance of each observed point to a parameterized human shape model. We apply a coarse-tofine scheme to recognize a wide range of poses to initialize the tracker. In our experiments, we demonstrate that our approach can accurately track human motion in real-time (15Hz) on a GPGPU. In direct comparison to two existing trackers (OpenNI, Microsoft Kinect SDK), we found that our approach is significantly more robust for unconstrained motions and under (partial) occlusions.