We present a challenging dataset, TartanAir, for robot navigation tasks and more. The data is collected in photo-realistic simulation environments in the presence of various lighting conditions, weather, and moving objects. By collecting data in simulation, we are able to obtain multimodal sensor data and precise ground-truth labels, including stereo RGB images, depth images, segmentation, optical flow, camera poses, and LiDAR point clouds. We set up a large number of environments with various styles and scenes, covering challenging viewpoints and diverse motion patterns that are difficult to achieve with physical data collection platforms. To enable data collection at such a large scale, we develop an automatic pipeline, including mapping, trajectory sampling, data processing, and data verification. We evaluate the impact of various factors on visual SLAM algorithms using our data. Results of state-of-the-art algorithms reveal that the visual SLAM problem is far from solved; methods that show good performance on established datasets such as KITTI do not perform well in more difficult scenarios. Although we use simulation, our goal is to push the limits of visual SLAM algorithms in the real world by providing a challenging benchmark for testing new methods, as well as large, diverse training data for learning-based methods. Our dataset is available at http://theairlab.org/tartanair-dataset.
Aerial cinematography is revolutionizing industries that require live and dynamic camera viewpoints such as entertainment, sports, and security. However, safely piloting a drone while filming a moving target in the presence of obstacles is immensely taxing, often requiring multiple expert human operators. Hence, there is a demand for an autonomous cinematographer that can reason about both geometry and scene context in real time. Existing approaches do not address all aspects of this problem: they either require high-precision motion-capture systems or global positioning system tags to localize targets, rely on prior maps of the environment, plan for short time horizons, or only follow fixed artistic guidelines specified before the flight. In this study, we address the problem in its entirety and propose a complete system for real-time aerial cinematography that, for the first time, combines: (a) vision-based target estimation; (b) 3D signed-distance mapping for occlusion estimation; (c) efficient trajectory optimization for long time-horizon camera motion; and (d) learning-based artistic shot selection. We extensively evaluate our system both in simulation and in field experiments by filming dynamic targets moving through unstructured environments. Our results indicate that our system can operate reliably in the real world without restrictive assumptions. We also provide in-depth analysis and discussions for each module, with the hope that our design tradeoffs can generalize to other related applications. Videos of the complete system can be found at https://youtu.be/ookhHnqmlaU.

Keywords: aerial robotics, cinematography, computer vision, learning, mapping, motion planning

Within the filming context, this cost function measures jerkiness of motion, safety, environmental occlusion of the actor, and shot quality (artistic quality of viewpoints). This cost function depends on the environment and on the actor forecast ξ_a, both of which are sensed on-the-fly.
The changing nature of the environment and of ξ_a demands replanning at a high frequency. Here we briefly touch upon the four components of the cost function J(ξ_q) (refer to Section 7 for details and mathematical expressions): Smoothness J_smooth(ξ_q): penalizes jerky motions that may lead to camera blur and unstable flight; Safety J_obs(ξ_q): penalizes proximity to obstacles that are unsafe for the UAV; Occlusion J_occ(ξ_q, ξ_a): penalizes occlusion of the actor by obstacles in the environment; Shot quality J_shot(ξ_q, ξ_a, Ω_art): penalizes poor viewpoint angles and scales that deviate from the desired artistic guidelines, given by the set of parameters Ω_art. In its simplest form, we can express J(ξ_q) as a linear composition of each individual cost, weighted by scalars λ_i:

J(ξ_q) = λ_1 J_smooth(ξ_q) + λ_2 J_obs(ξ_q) + λ_3 J_occ(ξ_q, ξ_a) + λ_4 J_shot(ξ_q, ξ_a, Ω_art),

and the objective is to find the trajectory that minimizes this cost: ξ_q* = argmin J(ξ_q).

[Table residue: success rate (%) and average distance to ξ_shot (m), comparing planning with J_occ + J_obs against J_obs alone; values truncated in extraction.]
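The linear composition of weighted cost terms described above can be illustrated with a minimal sketch. The individual cost terms, weights, and obstacle geometry below are illustrative stand-ins, not the paper's actual implementations; only smoothness and safety terms are shown, with occlusion and shot quality omitted for brevity.

```python
import numpy as np

def smoothness_cost(traj):
    """Penalize jerky motion via squared second differences along the path."""
    accel = np.diff(traj, n=2, axis=0)
    return float(np.sum(accel ** 2))

def obstacle_cost(traj, obstacles, safe_dist=1.0):
    """Penalize waypoints closer than safe_dist to any obstacle."""
    cost = 0.0
    for p in traj:
        d = min(np.linalg.norm(p - o) for o in obstacles)
        cost += max(0.0, safe_dist - d) ** 2
    return cost

def total_cost(traj, obstacles, lambdas=(1.0, 10.0)):
    """Linear composition J = lam1*J_smooth + lam2*J_obs."""
    l1, l2 = lambdas
    return l1 * smoothness_cost(traj) + l2 * obstacle_cost(traj, obstacles)

# A straight-line trajectory passing near a single obstacle
traj = np.linspace([0.0, 0.0], [10.0, 0.0], 11)
obstacles = [np.array([5.0, 0.5])]
print(total_cost(traj, obstacles))
```

A planner would minimize this scalar over candidate trajectories; tuning the λ weights trades smoothness against clearance, as the paper does across its four terms.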
Arctic clouds can profoundly influence surface radiation and thus surface melt. Over Greenland, these cloud radiative effects (CRE) vary greatly with the diverse topography. To investigate the ability of assorted platforms to reproduce heterogeneous CRE, we evaluate CRE spatial distributions from a satellite product, reanalyses, and a global climate model against estimates from 21 automatic weather stations (AWS). Net CRE estimated from AWS generally decreases with elevation, forming a “warm center” distribution. CRE areal averages from the five large‐scale data sets we analyze are all around 10 W/m². Modern‐Era Retrospective Analysis for Research and Applications version 2 (MERRA‐2), ERA‐Interim, and Clouds and the Earth's Radiant Energy System (CERES) CRE estimates agree with AWS and reproduce the warm center distribution. However, the National Center for Atmospheric Research Arctic System Reanalysis (ASR) and the Community Earth System Model Large Ensemble Community Project (LENS) show strong warming in the south and northwest, forming a warm L‐shape distribution. Discrepancies are mainly caused by longwave CRE in the accumulation zone. MERRA‐2, ERA‐Interim, and CERES successfully reproduce cloud fraction and its dominant positive influence on longwave CRE in this region. On the other hand, longwave CRE from ASR and LENS correlates strongly with ice water path instead of with cloud fraction or liquid water path. Moreover, ASR overestimates cloud fraction and LENS underestimates liquid water path substantially, both with limited spatial variability. MERRA‐2 best captures the observed interstation changes, captures most of the observed cloud‐radiation physics, and largely reproduces both albedo and cloud properties. The warm center CRE spatial distribution indicates that clouds enhance surface melt in the higher accumulation zone and reduce surface melt in the lower ablation zone.
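The net CRE estimated from AWS fluxes follows the standard decomposition into shortwave and longwave components. A minimal sketch of that decomposition is below; the clear-sky fluxes are assumed inputs (in practice derived from a clear-sky radiation model), and the numeric values are illustrative, not from the paper.

```python
def net_cre(sw_down_all, sw_down_clr, lw_down_all, lw_down_clr, albedo):
    """Net CRE = shortwave CRE + longwave CRE (W/m^2).

    SW CRE: clouds reduce the shortwave absorbed at the surface.
    LW CRE: clouds enhance the downwelling longwave.
    """
    sw_cre = (1.0 - albedo) * (sw_down_all - sw_down_clr)
    lw_cre = lw_down_all - lw_down_clr
    return sw_cre + lw_cre

# Example: over a bright, high-albedo accumulation-zone surface, the
# shortwave loss is damped and longwave warming dominates (positive net CRE)
print(net_cre(sw_down_all=300.0, sw_down_clr=400.0,
              lw_down_all=280.0, lw_down_clr=230.0, albedo=0.8))
```

This dependence on albedo is why the same cloud cover warms the high accumulation zone but can cool the darker ablation zone, producing the "warm center" pattern described above.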
Autonomous aerial cinematography has the potential to enable automatic capture of aesthetically pleasing videos without requiring human intervention, empowering individuals with the capability of high-end film studios. Current approaches either only handle off-line trajectory generation, or offer strategies that reason over short time horizons and simplistic representations for obstacles, which result in jerky movement and low real-life applicability. In this work we develop a method for aerial filming that is able to trade off shot smoothness, occlusion, and cinematography guidelines in a principled manner, even under noisy actor predictions. We present a novel algorithm for real-time covariant gradient descent that we use to efficiently find the desired trajectories by optimizing a set of cost functions. Experimental results show that our approach creates attractive shots, avoiding obstacles and occlusion 65 times over 1.25 hours of flight time, re-planning at 5 Hz with a 10 s time horizon. We robustly film human actors, cars, and bicycles performing different motions among obstacles, using various shot types.
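The key idea behind covariant gradient descent for trajectories can be shown in a minimal sketch, in the spirit of CHOMP-style optimizers on which such methods build. The smoothness metric A (a discrete second-difference matrix with fixed endpoints) and the toy single-waypoint gradient below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

n = 21
# SPD smoothness metric: discrete Laplacian with fixed endpoints
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

g = np.zeros(n)
g[10] = 1.0  # e.g., an obstacle gradient acting on a single waypoint

eta = 0.1
# Euclidean step: perturbs only the one waypoint, producing a kink
euclidean_step = -eta * g
# Covariant step: precondition by A^-1, spreading the update smoothly
# over the entire trajectory instead of one point
covariant_step = -eta * np.linalg.solve(A, g)

print(np.count_nonzero(euclidean_step))  # 1
```

The covariant step moves every waypoint a little (a tent-shaped deformation), which is why such updates avoid the jerky motion that plain gradient descent produces.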
The impact of clouds on Greenland's surface melt is difficult to quantify due to the limited amount of in situ observations. To better quantify cloud radiative effects (CRE), we utilize 29 automatic weather stations and provide the first analysis on seasonal and hourly timescales in both accumulation and ablation zones. Seasonal CRE shows opposing cycles across geographical regions. CRE generally increases during melt season in the north of Greenland, mainly due to longwave CRE enhancement by cloud fraction and liquid water. CRE decreases from May to July and increases afterwards in the middle and south of Greenland, mainly due to strengthened shortwave CRE caused by surface albedo reduction. This finding resolves the contrasting seasonal distributions in previous studies at different Arctic locations. The longwave seasonal cycle also shifts from increasing in the north to decreasing in the south, implying spatial variations in cloud and atmospheric conditions. On hourly timescales, longwave downwelling radiation exhibits a bimodal distribution peaking at ∼−70 W/m² (i.e., clear state) and ∼0 W/m² (i.e., cloudy state), suggesting that Greenland alternates between fairly clear and overcast conditions. In the cloudy state, CRE strongly correlates with the combined influence of solar zenith angle and albedo (r = 0.85, p < 0.01), probably because clouds are thick enough for CRE to become saturated. The close relationship between CRE and albedo over dark surfaces suggests a stabilizing feedback: Net CRE is negative at low albedo caused by snow melt and snow metamorphism and thus tends to increase albedo and decelerate surface melt.
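The bimodal distribution described above implies a simple two-state classification of hourly samples. A sketch of that idea is below; the threshold (−35 W/m², midway between the two stated modes) and the sample values are assumptions for illustration, not the paper's procedure.

```python
import numpy as np

def classify_sky_state(lw_values, threshold=-35.0):
    """Label each hourly sample as 'clear' or 'cloudy' by which mode of
    the bimodal longwave distribution it falls nearest (clear mode near
    -70 W/m^2, cloudy mode near 0 W/m^2)."""
    lw_values = np.asarray(lw_values, dtype=float)
    return np.where(lw_values < threshold, "clear", "cloudy")

# Illustrative hourly values clustered around the two modes
hourly_lw = [-72.0, -65.0, -3.0, 1.0, -68.0, -2.0]
print(list(classify_sky_state(hourly_lw)))
```

Once hours are partitioned this way, state-conditional statistics (such as the cloudy-state correlation of CRE with solar zenith angle and albedo) can be computed separately for each regime.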
The use of drones for aerial cinematography has revolutionized several applications and industries that require live and dynamic camera viewpoints such as entertainment, sports, and security. However, safely controlling a drone while filming a moving target usually requires multiple expert human operators; hence the need for an autonomous cinematographer. Current approaches have severe real-life limitations such as requiring fully scripted scenes, high-precision motion-capture systems or GPS tags to localize targets, and prior maps of the environment to avoid obstacles and plan for occlusion. In this work, we overcome such limitations and propose a complete system for aerial cinematography that combines: (1) a vision-based algorithm for target localization; (2) a real-time incremental 3D signed-distance map algorithm for occlusion and safety computation; and (3) a real-time camera motion planner that optimizes smoothness, collisions, occlusions, and artistic guidelines. We evaluate robustness and real-time performance in a series of field experiments and simulations by tracking dynamic targets moving through unknown, unstructured environments. Finally, we verify that despite removing previous limitations, our system achieves state-of-the-art performance.
Recently, there has been rising interest in perceiving image aesthetics. Existing works treat image aesthetics as a classification or regression problem. To extend the cognition from rating to reasoning, a deeper understanding of aesthetics should be based on revealing why a high- or low-aesthetic score should be assigned to an image. From this point of view, we propose a model referred to as the Neural Aesthetic Image Reviewer, which can not only give an aesthetic score for an image, but also generate a textual description explaining why the image leads to a plausible rating score. Specifically, we propose two multi-task architectures based on shared aesthetically semantic layers and task-specific embedding layers at a high level for performance improvement on different tasks. To facilitate research on this problem, we collect the AVA-Reviews dataset, which contains 52,118 images and 312,708 comments in total. Through multi-task learning, the proposed models can rate aesthetic images as well as produce comments in an end-to-end manner. It is confirmed that the proposed models outperform the baselines according to the performance evaluation on the AVA-Reviews dataset. Moreover, we demonstrate experimentally that our model can generate textual reviews related to aesthetics, which are consistent with human perception.
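The multi-task idea above, shared aesthetically semantic layers feeding task-specific heads, can be sketched as a minimal forward pass. The layer sizes, random weights, and two-head split below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Shared aesthetically semantic layer (assumed 512-d features -> 256-d)
W_shared = rng.normal(size=(512, 256))
# Task-specific heads on top of the shared representation
W_score = rng.normal(size=(256, 1))      # regression head: aesthetic score
W_token = rng.normal(size=(256, 10000))  # language head: next-token logits
                                         # for comment generation

def forward(image_features):
    h = relu(image_features @ W_shared)   # shared representation
    score = h @ W_score                   # task 1: rating
    token_logits = h @ W_token            # task 2: review generation step
    return score, token_logits

features = rng.normal(size=(1, 512))
score, logits = forward(features)
print(score.shape, logits.shape)  # (1, 1) (1, 10000)
```

Training would backpropagate a weighted sum of the regression loss and the captioning loss through the shared layer, which is what lets each task regularize the other.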
Air pollution during the Chinese Spring Festival (CSF) period over eastern China was investigated using long-term observations from 2001 to 2012 at 323 stations. The dominant feature of the pollutants around the CSF holidays is the significant reduction in concentration. During the 10 day period around the CSF (but excluding the Lunar New Year's Day, LNYD), PM10 decreases by 9.24%. In association with the aerosol reduction, temperature drops significantly over eastern China. From the third day before the LNYD to the second day after, the daily mean temperature anomaly is −0.81 °C. From the third day to the seventh day after the LNYD, the significant negative temperature anomalies move out of China, extending to a broad area from the South China Sea to the western North Pacific. Between the 8th and the 12th days, the significant temperature anomalies can still be found near 140°E. The reduced downward longwave flux might play an important role in the holiday cooling. The possible atmospheric feedback is discernible. The thermal and circulation configuration accompanying the cooling favors baroclinic interaction between the upper and lower troposphere for the midlatitude cyclone. The anomalous cyclone matures during the third to the seventh day after the LNYD and disappears 12 days later. The anomalous northerly winds in association with the cyclone decrease the temperature and also help disperse the holiday aerosols over eastern China.