2023
DOI: 10.1177/17298806231164831

Visual localization with a monocular camera for unmanned aerial vehicle based on landmark detection and tracking using YOLOv5 and DeepSORT

Abstract: Absolute visual localization is of significant importance for unmanned aerial vehicles when the satellite-based localization system is not available. With the rapid evolution in the field of deep learning, the real-time visual detection and tracking of landmarks by an unmanned aerial vehicle could be implemented onboard. This study demonstrates a landmark-based visual localization framework for unmanned aerial vehicles flying at low altitudes. YOLOv5 and DeepSORT are used for multi-object detection and tracking […]
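The abstract names YOLOv5 and DeepSORT for multi-object detection and tracking, but the excerpt stops before describing how the two are chained per frame. The following is a minimal sketch of such a detection-and-tracking loop, assuming the ultralytics YOLOv5 torch.hub model and the deep_sort_realtime package; the model variant, tracker implementation, video source, and parameter values are illustrative assumptions rather than details taken from the paper, and the downstream step that converts tracked landmarks into an absolute UAV position is omitted.

# Minimal sketch of a per-frame YOLOv5 + DeepSORT landmark detection/tracking loop.
# Assumptions (not from the paper): ultralytics YOLOv5 via torch.hub and the
# deep_sort_realtime package; the localization stage that maps tracked landmarks
# to an absolute UAV position is not shown.
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort

model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # small variant for onboard use
tracker = DeepSort(max_age=30)                           # drop tracks unseen for 30 frames

cap = cv2.VideoCapture("uav_flight.mp4")                 # hypothetical low-altitude video
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # YOLOv5 expects RGB input; results.xyxy[0] is an (N, 6) tensor: x1, y1, x2, y2, conf, cls
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    detections = []
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        # DeepSORT takes ([left, top, width, height], confidence, class) tuples
        detections.append(([x1, y1, x2 - x1, y2 - y1], conf, int(cls)))

    # DeepSORT associates detections across frames and assigns persistent track IDs
    for track in tracker.update_tracks(detections, frame=frame):
        if not track.is_confirmed():
            continue
        left, top, right, bottom = track.to_ltrb()
        print(f"landmark track {track.track_id}: bbox=({left:.0f}, {top:.0f}, {right:.0f}, {bottom:.0f})")

cap.release()

In a landmark-based framework of the kind the abstract describes, the persistent track identities would then be matched against known landmark positions to recover the vehicle's absolute location; that matching step is not part of this sketch.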

Cited by 8 publications (3 citation statements)
References 55 publications
“…These characteristics of YOLOv5 make it suitable for our application because UAVs have limited power and computational resources. A real‐world flight experiment was conducted using a quadcopter UAV flying at low altitude and using YOLOv5 for object recognition [32].…”
Section: Performance Evaluation
confidence: 99%
“…Visual odometry (VO) and VSLAM are two closely related techniques that are used to determine a robot or machine’s location and orientation through the analysis of corresponding camera images. Both techniques can utilize a monocular camera, but they have distinct characteristics and objectives [ 6 , 7 , 8 ].…”
Section: Introduction
confidence: 99%
“…Multi-vision requires multiple cameras for shooting, resulting in higher costs, and also requires addressing the feature matching issue from different cameras, which leads to complex operations [16,17]. In contrast, monocular vision only requires a single camera, enabling implementation through the pinhole imaging principle, resulting in lower costs and convenient operation [18,19]. The commonly used methods for monocular vision positioning include the Perspective-n-Point (PNP) method [20], the imaging model method [21], the data regression modeling method [22], and the geometric relationship method [23].…”
Section: Introduction
confidence: 99%
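Since the last citation statement lists the Perspective-n-Point (PnP) method as a standard monocular positioning technique, a minimal worked sketch may be useful. It uses OpenCV's cv2.solvePnP with four assumed landmark correspondences and assumed pinhole intrinsics; all numeric values below are placeholders, not data from the cited papers.

# Minimal sketch of monocular pose estimation with the Perspective-n-Point (PnP)
# method via OpenCV. All numbers are illustrative placeholders.
import cv2
import numpy as np

# 3D coordinates of four known landmarks in the world frame (metres), assumed
object_points = np.array([
    [0.0, 0.0, 0.0],
    [5.0, 0.0, 0.0],
    [5.0, 5.0, 0.0],
    [0.0, 5.0, 0.0],
], dtype=np.float64)

# Their observed pixel locations in the current image (assumed correspondences)
image_points = np.array([
    [320.0, 410.0],
    [480.0, 405.0],
    [470.0, 290.0],
    [330.0, 295.0],
], dtype=np.float64)

# Pinhole camera intrinsics and zero lens distortion, both assumed
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]], dtype=np.float64)
dist_coeffs = np.zeros(5)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist_coeffs)
if ok:
    R, _ = cv2.Rodrigues(rvec)      # rotation vector -> rotation matrix
    camera_position = -R.T @ tvec   # camera centre expressed in the world frame
    print("estimated camera position (world frame):", camera_position.ravel())

Given at least four 2D–3D correspondences and the camera intrinsics, solvePnP returns the rotation and translation of the world frame relative to the camera, from which the camera (and hence vehicle) position in the world frame follows.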