Unifying terrain awareness through real-time semantic segmentation

Yang, Kailun; Bergasa, Luis M.; Romera, Eduardo; Cheng, Ruiqi; Chen, Tianxue; Wang, Kaiwei

doi:10.1109/ivs.2018.8500506

Cited by 49 publications

(25 citation statements)

References 19 publications


“…To improve navigation and orientation of people with visual impairment, some studies investigated the functionality of navigational assistance for visually impaired individuals (8–14). Besides the above approaches, some studies are more focused on devices that provide vibration and acoustic feedback to improve object detection for people with visual impairment (15–20).…”

Section: Literature Review (mentioning)

confidence: 99%

Looking through the Perceptions of Blinds: Potential Impacts of Connected Autonomous Vehicles on Pedestrians with Visual Impairment

Soldouz, Hasnine, Sukhai, et al. 2020

Transportation Research Record


The paper investigates the impacts and barriers posed by connected autonomous vehicles (CAVs) for pedestrians with visual impairment. This study uses a customized web-based survey of visually impaired people from Canada and abroad. Collected data are used to estimate econometric models to identify the critical factors that affect the level of trust in CAVs and the preference for using CAVs from the visually impaired individuals’ perspective. Separate models are estimated for Canadian and non-Canadian samples, as Canadian and non-Canadian participants show some differences in perception and positive attitude towards CAVs. The models reveal that the majority of the respondents prefer to get feedback and alerts from CAVs. Congenitally blind Canadians are less likely to trust CAVs, but non-Canadian congenital blinds tend to trust CAVs. The models also indicate that the respondents who experienced being near an accident with an electric vehicle (EV) are less likely to choose CAVs. Respondents who rely on mobile applications and technology-based devices for navigating purposes tend to trust CAVs. Blind people who rely on conventional navigation tools (e.g., white cane, guide dog, etc.) are less likely to be the users of CAVs. Gender effect is visible, as the female participants tend not to trust CAVs. In relation to policy recommendations, subsidies should be provided to various advocacy groups to offer orientation and mobility (O&M) training services, which are pivotal to educate how to use technology-based navigational services. Also, automobile manufacturers should be enforced to add acoustic vehicle alert systems (AVAS) to both EVs and CAVs.


“…Semantic segmentation is a basic task of computer vision, whose purpose is to partition an image into several coherent semantically-meaningful parts. Compared with traditional approaches that need to be deployed in complex separate ways, semantic segmentation can be utilized to unify diverse detection tasks desired by navigation systems, at least in standard outdoor conditions [1] [2].…”

Section: Introduction (mentioning)

confidence: 99%

ACNET: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation

Yang, Lei, et al. 2019

2019 IEEE International Conference on Image Processing (ICIP)

Self Cite


Compared to RGB semantic segmentation, RGBD semantic segmentation can achieve better performance by taking depth information into consideration. However, it is still problematic for contemporary segmenters to effectively exploit RGBD information since the feature distributions of RGB and depth (D) images vary significantly in different scenes. In this paper, we propose an Attention Complementary Network (ACNet) that selectively gathers features from RGB and depth branches. The main contributions lie in the Attention Complementary Module (ACM) and the architecture with three parallel branches. More precisely, ACM is a channel attention-based module that extracts weighted features from RGB and depth branches. The architecture preserves the inference of the original RGB and depth branches, and enables the fusion branch at the same time. Based on the above structures, ACNet is capable of exploiting more high-quality features from different channels. We evaluate our model on SUN-RGBD and NYUDv2 datasets, and prove that our model outperforms state-of-the-art methods. In particular, a mIoU score of 48.3% on NYUDv2 test set is achieved with ResNet50. We will release our source code based on PyTorch and the trained segmentation model at https://github.com/anheidelonghu/ACNet.


“…Similarly, information about the surrounding terrain may help a robot modify its course of action during autonomous navigation [2], [6]. In the field of assistive robotics [7], the robot can warn a visually impaired person of potential danger concerning the ground type [8], [9].…”

Section: Introduction (mentioning)

confidence: 99%

Self-Supervised Audio-Visual Feature Learning for Single-Modal Incremental Terrain Type Clustering

2021


The key to an accurate understanding of terrain is to extract informative features from the multi-modal data obtained from different devices. Sensors such as RGB cameras, depth sensors, vibration sensors, and microphones serve as sources of multi-modal data. Many studies have explored ways to use them, especially in the robotics field. Some papers have successfully introduced single-modal or multi-modal methods. However, in practice, robots can be faced with extreme conditions; microphones do not work well in crowded scenes, and an RGB camera cannot capture terrains well in the dark. In this paper, we present a novel framework using the multi-modal variational autoencoder and the Gaussian mixture model clustering algorithm on image data and audio data for terrain type clustering by forcing the features to be closer together in the feature space. Our method enables terrain type clustering even if one of the modalities (either image or audio) is missing at test time. We evaluated the clustering accuracy against a conventional multi-modal terrain type clustering method and conducted ablation studies to show the effectiveness of our approach.

Index Terms: Self-supervised, Terrain type clustering, Multi-modal learning

Figure 1: Overview of our terrain clustering framework. We train the model to extract the features from audio-visual data in a self-supervised manner. At test time, we assume that only a single modality (either audio or visual) can be accessed due to the extreme conditions; the obtained data is incrementally clustered into terrain types.
