2016
DOI: 10.1177/0278364916679892
Multimodal learning and inference from visual and remotely sensed data

Abstract: Autonomous vehicles are often tasked to explore unseen environments, aiming to acquire and understand large amounts of visual image data and other sensory information. In such scenarios, remote sensing data may be available a priori, and can help to build a semantic model of the environment and plan future autonomous missions. In this paper, we introduce two multimodal learning algorithms to model the relationship between visual images taken by an autonomous underwater vehicle during a survey and remotely sens…

Cited by 33 publications (31 citation statements)
References 42 publications
“…Typically, the approach involves learning a multilayer feature representation of each modality individually, followed by a multimodal layer that captures the correlations between the high-level single-modality features [3][4][5][6]. Srivastava and Salakhutdinov [4] utilise a Deep Boltzmann Machine to model the relationship between visual images and associated text keywords.…”
Section: Related Work (A. Multimodal Learning)
Citation type: mentioning
confidence: 99%
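The first excerpt describes a common two-stage multimodal design: a multilayer feature extractor per modality, followed by a joint layer over the concatenated high-level codes. Below is a minimal NumPy sketch of that pattern under illustrative assumptions (64-d visual input, 16-d bathymetric input, random untrained weights); it is not the paper's model, only the generic architecture the excerpt names.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def encoder(x, weights):
    """Multilayer feature representation of a single modality."""
    h = x
    for W in weights:
        h = relu(h @ W)
    return h

# Hypothetical dimensions: 64-d visual features, 16-d bathymetric features.
W_vis = [rng.normal(size=(64, 32)), rng.normal(size=(32, 16))]
W_bat = [rng.normal(size=(16, 16)), rng.normal(size=(16, 16))]
W_joint = rng.normal(size=(32, 8))  # multimodal layer over the concatenated codes

def multimodal_features(x_vis, x_bat):
    h_vis = encoder(x_vis, W_vis)  # high-level single-modality features
    h_bat = encoder(x_bat, W_bat)
    # Joint layer capturing correlations between the two modalities.
    return relu(np.concatenate([h_vis, h_bat], axis=-1) @ W_joint)

z = multimodal_features(rng.normal(size=64), rng.normal(size=16))
print(z.shape)  # (8,)
```

In a trained system the encoder and joint weights would be learned (e.g. layer-wise pretraining followed by joint fine-tuning); the sketch only shows the data flow.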
“…In our recent work [3], we build on these techniques, utilising a gated model [7] for the multimodal layer. The gated model can be framed as a 'mixture' of feature learners, which allows the model to predict, for a given bathymetric feature, the different types of visual features that may be observed.…”
Section: Related Work (A. Multimodal Learning)
Citation type: mentioning
confidence: 99%
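The excerpt above frames the gated model as a "mixture" of feature learners: a gating function, conditioned on the bathymetric input, weights several experts that each predict a different type of visual feature. A minimal NumPy sketch of that mixture idea follows; the expert count, dimensions, and linear experts are illustrative assumptions, not the gated model of [7].

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sizes: 16-d bathymetric input, 3 experts, 64-d visual prediction.
n_experts, d_bat, d_vis = 3, 16, 64
W_gate = rng.normal(size=(d_bat, n_experts))
W_experts = rng.normal(size=(n_experts, d_bat, d_vis))

def predict_visual(x_bat):
    """Mixture of feature learners: the gate decides, per bathymetric
    input, how much each expert's visual-feature prediction contributes."""
    gates = softmax(x_bat @ W_gate)                    # (n_experts,)
    preds = np.einsum('d,kdv->kv', x_bat, W_experts)   # each expert's prediction
    return gates @ preds                               # gated combination

y = predict_visual(rng.normal(size=d_bat))
print(y.shape)  # (64,)
```

The gating lets different experts specialise: for one seafloor type the gate can route mass to an expert tuned to that regime, matching the excerpt's point that different visual features may be observed for a given bathymetric feature.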
“…Current vehicle telematics systems produce large volumes of multimodal data [2], which is predominantly stored in cloud-based datacenters and processed by traditional analytics tools. Consequently, vehicle management is becoming a highly data-centric business.…”
Section: Introduction
Citation type: mentioning
confidence: 99%