The farming industry constantly seeks to automate the processes involved in agricultural production, such as sowing, harvesting and weed control, and the use of autonomous mobile robots for those tasks is of great interest. Arable lands pose hard challenges for Simultaneous Localization and Mapping (SLAM) systems, which are key to mobile robotics, because the scene is highly repetitive and the crop leaves move in the wind. In recent years, several Visual-Inertial Odometry (VIO) and SLAM systems have been developed. They have proved to be robust and capable of achieving high accuracy in indoor and outdoor urban environments. However, they have not been properly assessed in agricultural fields. In this study we assess the most relevant state-of-the-art VIO systems in terms of accuracy and processing time on arable lands, to better understand how they behave in these environments. In particular, the evaluation is carried out on a collection of sensor data recorded by our wheeled robot in a soybean field, which was publicly released as the Rosario data set. The evaluation shows that the highly repetitive appearance of the environment, the strong vibration produced by the rough terrain and the movement of the leaves caused by the wind expose the limitations of current state-of-the-art VIO and SLAM systems. We analyze the systems' failures and highlight the observed drawbacks, including initialization failures, tracking loss and sensitivity to Inertial Measurement Unit saturation. Finally, we conclude that even though certain systems, such as ORB-SLAM3 and the stereo Multi-State Constraint Kalman Filter, show good results relative to the others, further improvements are needed to make them reliable in agricultural fields for applications such as soil tillage of crop rows and pesticide spraying.
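The abstract above does not say which accuracy metric is used. As an illustration only, one common choice for trajectory evaluation is the root-mean-square absolute trajectory error (ATE). The sketch below assumes the estimated and ground-truth positions are already time-associated and expressed in the same frame; the function name `ate_rmse` is hypothetical and not taken from the paper.

```python
import numpy as np


def ate_rmse(estimated, ground_truth):
    """RMSE of the per-pose translational error.

    Assumes both inputs are (N, 3) position arrays that are already
    time-associated and aligned to a common frame (an assumption of
    this sketch, not a step described in the abstract).
    """
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    if est.shape != gt.shape:
        raise ValueError("trajectories must have the same shape")
    # Euclidean distance between each estimated and reference position.
    errors = np.linalg.norm(est - gt, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))


# Toy data: the estimate drifts sideways by 0.1 m per step.
reference = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
estimate = [[0.0, 0.0, 0.0], [1.0, 0.1, 0.0], [2.0, 0.2, 0.0]]
error = ate_rmse(estimate, reference)
```

In practice the two trajectories would first be aligned (for example with a rigid-body fit) before the error is computed; that step is omitted here for brevity.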
Traditional visual odometry (VO) systems, whether direct or feature-based, are prone to matching errors between images. Moreover, monocular configurations can only estimate localization up to a scale factor, which rules out their immediate use in robotics or virtual reality applications. Recently, several Computer Vision problems have been successfully tackled by Deep Learning algorithms. In this work we present a Deep Learning-based monocular visual odometry system called WGANVO. Specifically, we train a GAN-based neural network to regress a motion estimate. The resulting model receives a pair of images and estimates the relative motion between them. We train the neural network using a semi-supervised approach. Unlike traditional geometry-based monocular systems, our Deep Learning-based method is able to estimate the absolute scale of the scene without extra information or prior knowledge. We evaluate WGANVO on the well-known KITTI data set. We show that our system runs in real time, and the accuracy obtained encourages further development of Deep Learning-based localization systems.
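A model that estimates the relative motion between consecutive image pairs yields a full trajectory only once those motions are chained together. The sketch below shows that chaining under the assumption that each relative motion is given as a 4x4 homogeneous transform; the helper names `translation` and `compose_trajectory` are hypothetical, and the network itself is not modelled.

```python
import numpy as np


def translation(x, y, z):
    """Build a 4x4 homogeneous transform with identity rotation."""
    transform = np.eye(4)
    transform[:3, 3] = [x, y, z]
    return transform


def compose_trajectory(relative_motions):
    """Chain per-frame relative motions into absolute poses.

    Each element is assumed to be the 4x4 transform of frame k+1
    expressed in the coordinates of frame k (an assumption of this
    sketch). The first pose is the identity.
    """
    pose = np.eye(4)
    trajectory = [pose.copy()]
    for motion in relative_motions:
        # Right-multiplying applies the motion in the current frame.
        pose = pose @ motion
        trajectory.append(pose.copy())
    return trajectory


# Two steps of 1 m forward along x.
poses = compose_trajectory([translation(1, 0, 0), translation(1, 0, 0)])
final_position = poses[-1][:3, 3]
```

Because every pose is the product of all earlier estimates, any per-pair error accumulates along the trajectory, which is why odometry drifts over time.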
The accelerating pace of automation in agricultural tasks demands highly accurate and robust localization systems for field robots. Simultaneous Localization and Mapping (SLAM) methods inevitably accumulate drift on exploratory trajectories and rely primarily on place revisiting and loop closing to keep the global localization error bounded. Loop closure is particularly challenging in agricultural fields, as the local visual appearance of different views is very similar and may change easily due to weather effects. A suitable alternative in practice is to employ global positioning sensors jointly with the rest of the robot sensors. In this paper we propose and implement the fusion of global navigation satellite system (GNSS), stereo views, and inertial measurements for localization purposes. Specifically, we incorporate, in a tightly coupled manner, GNSS measurements into the stereo-inertial ORB-SLAM3 pipeline. We thoroughly evaluate our implementation on the sequences of the Rosario data set, recorded by an autonomous robot in soybean fields, and on our own in-house data. Our data includes measurements from a conventional GNSS, rarely included in evaluations of state-of-the-art approaches. We characterize the performance of GNSS-stereo-inertial SLAM in this application case, reporting pose error reductions between 10% and 30% compared to visual-inertial and loosely coupled GNSS-stereo-inertial baselines. In addition to this analysis, we also release the code of our implementation as open source.