Multimodal data fusion for object recognition

Knyaz, V. A.

doi:10.1117/12.2526067

Cited by 11 publications

(10 citation statements)

References 18 publications


“…In this section, we present the performance of the proposed YOLOrs architecture on the VEhicle Detection in Aerial Imagery (VEDAI) dataset [46]. The performance of YOLOrs is compared with that of unimodal YOLOv4 [26], EfficientDet (D0) [47], RetinaNet (with backbone ResNet 50) [43], and YOLOv3 [18] trained on RGB and IR images as well as their multimodal versions trained on concatenated RGB and IR images [45] as shown in Fig. 6.…”

Section: Experimental Studies (mentioning)

confidence: 99%


YOLOrs: Object Detection in Multimodal Remote Sensing Imagery

Sharma, Dhanaraj, Karnam, et al. 2021

IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing


Deep-learning object detection methods that are designed for computer vision applications tend to under-perform when applied to remote sensing data. This is because, contrary to computer vision, in remote sensing training data are harder to collect and targets can be very small, occupying only a few pixels in the entire image, and exhibit arbitrary perspective transformations. Detection performance can improve by fusing data from multiple remote sensing modalities, including RGB, IR, hyper-spectral, multi-spectral, synthetic aperture radar, and LiDAR, to name a few. In this work, we propose YOLOrs: a new convolutional neural network, specifically designed for realtime object detection in multimodal remote sensing imagery. YOLOrs can detect objects at multiple scales, with smaller receptive fields to account for small targets, as well as predict target orientations. In addition, YOLOrs introduces a novel midlevel fusion architecture that renders it applicable to multimodal aerial imagery. Our experimental studies compare YOLOrs with contemporary alternatives and corroborate its merits.



“…In Fig. 10b, we plot the mAP values of multimodal approaches versus training epoch index and observe that the proposed YOLOrs outperforms YOLOv3 [45], RetinaNet, EfficientDet (D0) [47], and YOLOv4 [26] with early fusion approaches.…”

Section: Multimodal Data (mentioning)

confidence: 99%

YOLOrs: Object Detection in Multimodal Remote Sensing Imagery

Sharma, Dhanaraj, Karnam, et al. 2021

IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing


“…To provide synchronization of collected color and thermal images the technique based on scene 3D reconstruction (Knyaz, 2019) was applied. Both color and thermal image sequences were used for scene 3D reconstruction by structure from motion technique.…”

Section: Scene 3D Model Reconstruction (mentioning)

confidence: 99%

“…The multimodal dataset generated and augmented using this technique was used for CNN training for the tasks of object detection and object re-identification (Knyaz, 2019), (Kniaz and Knyaz, 2019). The evaluation of CNN trained on the created multimodal dataset showed improving of the CNN performance for considered tasks.…”

Section: Figure 7 Image Orientation and Synchronization (mentioning)

confidence: 99%

Joint Geometric Calibration of Color and Thermal Cameras for Synchronized Multimodal Dataset Creating

Knyaz, Moshkantsev 2019

Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci.


Abstract. With increasing performance and availability of thermal cameras the number of applications using them in various purposes grows noticeable. Nowadays thermal vision is widely used in industrial control and monitoring, thermal mapping of industrial areas, surveillance and robotics which output huge amount of thermal images. This circumstance creates the necessary basis for applying deep learning which demonstrates the state-of-the-art performance for the most complicated computer vision tasks. Using different modalities for scene analysis allows to outperform results of mono-modal processing, but in case of machine learning it requires synchronized annotated multimodal dataset. The prerequisite condition for such dataset creating is geometric calibration of sensors used for image acquisition. So the purpose of the performed study was to develop a technique for joint calibration of color and long wave infra-red cameras which are to be used for collecting multimodal dataset needed for the tasks of computer vision algorithms developing and evaluating. The paper presents the techniques for camera parameters estimation and experimental evaluation of interior orientation of color and long wave infra-red cameras for further exploiting in datasets collecting. Also the results of geometrically calibrated camera exploiting for 3D reconstruction and 3D model realistic texturing based on visible and thermal imagery are presented. They proved the effectivity of the developed techniques for collecting and augmenting synchronized multimodal imagery dataset for convolutional neural networks model training and evaluating.


“…The LEART training set was used to train the modified network architecture [6]. This sample was collected using a DJI Mavic PRO UAV, equipped with an integrated visible camera, and an additional far-infrared camera (8-14 μm) MH-SM576-6 with a resolution of 640 × 480 pixels.…”

Section: Dataset Generation (mentioning)

confidence: 99%

Segmentation and visualization of obstacles for the enhanced vision system using generative adversarial networks

Kniaz, Kozyrev, Bordodymov, et al. 2019


Long range infrared cameras may provide increasing crew situational awareness in limited vision and night conditions. Similar cameras are installed in modern civil aircraft's as part of an improved vision system. Correct thermal image interpretation by the crew requires certain experience, due to the fact that view of the scene very different from the visible range and may change within time of day and season. This paper discusses the deep generative-adversary neural network to automatically convert thermal images to semantically similar color images of the visible range.

