Abstract: There are several benefits to constructing a lightweight vision system that runs directly on resource-limited hardware devices. Most deep learning-based computer vision systems, such as YOLO (You Only Look Once), use computationally expensive backbone feature extractor networks, such as ResNet and Inception. To address the issue of network complexity, researchers created SqueezeNet, a compressed and diminutive alternative network. However, SqueezeNet was trained to recognize 1000 unique objects as a …
“…Over the years, activity within the UAV autonomy research space has led to a steady increase in published research on novel solutions for autonomous navigation features. Many of these projects use trained models to infer a solution to autonomous UAV tasks [1][2][3][4]. Previous reviews of state-of-the-art solutions in the autonomous navigation research space revealed that, of the classified autonomous features, collision avoidance, obstacle detection, and object distinction (including object detection) were the most popular research topics.…”
To facilitate the integration of autonomous unmanned air vehicles (UAVs) into day-to-day life, it is imperative that safe navigation can be demonstrated in all relevant scenarios. For UAVs using a navigational protocol driven by artificial neural networks, training and testing data from multiple environmental contexts are needed to ensure that bias is minimised. The reduction in predictive capacity when faced with unfamiliar data is a common weak point in trained networks, and it worsens the further the input data deviates from the training data. However, training for multiple environmental variables dramatically increases the man-hours required for data collection and validation. In this work, a potential solution to this data availability issue is presented through the generation and evaluation of photo-realistic image datasets from a simulation of 3D-scanned physical spaces, which are theoretically linked in a digital twin (DT) configuration. This simulation is then used to generate environmentally varied iterations of the target object in that physical space across two contextual variables (weather and daylight). The result is an expanded dataset of bicycles containing weather- and time-varied versions of the same images, which are then evaluated using a generic build of the YOLOv3 object detection network; the network's response is compared against two real-image (night and day) datasets as a baseline. The results reveal that the network response remained consistent across the temporal axis, maintaining a measured domain shift of approximately 23% between the two baselines.
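The abstract does not specify how the ~23% domain shift was computed, but one simple way to quantify such a shift is as the relative drop in mean per-image detection confidence between a baseline dataset and a shifted one. The sketch below is a minimal, hypothetical illustration of that idea; the `day_scores` and `night_scores` lists are invented stand-ins for confidences a YOLOv3 model might emit, not the paper's data.

```python
# Hypothetical sketch: domain shift measured as the relative drop in
# mean detection confidence between two evaluation datasets.
# The score lists are illustrative only, NOT results from the paper.

def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)

def domain_shift(baseline_scores, shifted_scores):
    """Relative drop (as a percentage) in mean detection confidence
    from the baseline dataset to the shifted dataset."""
    base = mean(baseline_scores)
    shifted = mean(shifted_scores)
    return 100.0 * (base - shifted) / base

# Invented per-image confidences for the 'bicycle' class.
day_scores   = [0.91, 0.88, 0.93, 0.86, 0.90]   # daytime baseline
night_scores = [0.70, 0.66, 0.72, 0.68, 0.71]   # night baseline

shift = domain_shift(day_scores, night_scores)
print(f"Measured domain shift: {shift:.1f}%")
```

Other shift measures (e.g. the gap in mAP or recall between the two baselines) would follow the same pattern: evaluate the unchanged detector on both datasets and report the relative difference.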