Unmanned aerial vehicles (UAVs) have been used for many applications in recent years, from urban search and rescue, to agricultural surveying, to autonomous underground mine exploration. However, deploying UAVs in tight, indoor spaces, especially close to humans, remains a challenge. One solution, when only a limited payload is required, is to use micro-UAVs, which pose less risk to humans and typically cost less to replace after a crash. However, micro-UAVs can only carry a limited sensor suite, e.g. a monocular camera instead of a stereo pair or LiDAR, complicating tasks like dense mapping and markerless multi-person 3D human pose estimation, which are needed to operate in tight environments around people. Monocular approaches to such tasks exist, and dense monocular mapping approaches have been successfully deployed for UAV applications. However, despite many recent works on both marker-based and markerless multi-UAV single-person motion capture, markerless single-camera multi-person 3D human pose estimation remains a much earlier-stage technology, and we are not aware of existing attempts to deploy it in an aerial context. In this paper, we present what is thus, to our knowledge, the first system to perform simultaneous mapping and multi-person 3D human pose estimation from a monocular camera mounted on a single UAV. In particular, we show how to loosely couple state-of-the-art monocular depth estimation and monocular 3D human pose estimation approaches to reconstruct a hybrid map of a populated indoor scene in real time. We validate our component-level design choices via extensive experiments on the large-scale ScanNet and GTA-IM datasets. To evaluate our system-level performance, we also construct a new Oxford Hybrid Mapping dataset of populated indoor scenes.
I. INTRODUCTION

Recent years have seen huge improvements in the flight stability and obstacle avoidance capabilities of unmanned aerial vehicles, driven by applications including aerial search and rescue [1], aerial tracking and surveillance [2], drone cinematography [3], robotic agriculture [4], and the exploration of everything from mines [5] to other planets [6]. However, deploying drones in confined indoor spaces close to people remains challenging. This is unfortunate, because numerous applications, from awareness systems for emergency responders to indoor drone cinematography for film-makers, could benefit significantly from such a capability.

To operate in such an environment, it is helpful for a drone to be able to both map its geometry and detect/track the people moving within it, ideally in real time. At the same time, however, the physical constraints imposed by the environment encourage the use of a small drone (e.g. ≈10cm

All authors are with the University of Oxford. M. Vankadari, A. Everitt and S. Shin assert joint second authorship.
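To illustrate the loose coupling described above, the following minimal sketch runs a monocular depth estimator and a multi-person 3D pose estimator independently on each frame and merges their outputs only at the map level. The estimator classes, the `HybridMap` structure, and all names here are hypothetical stand-ins for illustration, not the paper's actual implementation:

```python
import numpy as np


class DepthEstimator:
    """Stand-in for a learned monocular depth network (hypothetical)."""

    def estimate(self, frame: np.ndarray) -> np.ndarray:
        # A real system would run a CNN; here we return a flat 2 m depth map.
        return np.full(frame.shape[:2], 2.0, dtype=np.float32)


class PoseEstimator:
    """Stand-in for a markerless multi-person 3D pose network (hypothetical)."""

    def estimate(self, frame: np.ndarray) -> list:
        # Return one 3D skeleton per detected person (17 joints x 3 coords).
        return [np.zeros((17, 3), dtype=np.float32)]


class HybridMap:
    """Hybrid map: dense static geometry plus dynamic people (hypothetical)."""

    def __init__(self):
        self.depth_keyframes = []  # accumulated dense scene geometry
        self.people = []           # most recent per-person skeletons

    def fuse(self, depth: np.ndarray, skeletons: list) -> None:
        # Loose coupling: the two estimators never exchange state; their
        # outputs are only combined here, in the map representation.
        self.depth_keyframes.append(depth)
        self.people = skeletons


def process_frame(frame, depth_net, pose_net, hybrid_map):
    depth = depth_net.estimate(frame)        # static geometry branch
    skeletons = pose_net.estimate(frame)     # dynamic people branch
    hybrid_map.fuse(depth, skeletons)
    return hybrid_map


frame = np.zeros((480, 640, 3), dtype=np.uint8)
m = process_frame(frame, DepthEstimator(), PoseEstimator(), HybridMap())
print(len(m.depth_keyframes), len(m.people))  # 1 1
```

The design choice being sketched is that neither network depends on the other's output, so either component can be swapped for a newer state-of-the-art model without retraining or re-engineering the rest of the pipeline.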