As one of the fundamental technologies for scene understanding, semantic segmentation has been widely explored in recent years. Light field cameras encode geometric information by simultaneously recording the spatial and angular information of light rays, which offers a new way to approach this task. In this paper, we propose a high-quality and challenging urban scene dataset containing 1074 samples composed of real-world and synthetic light field images, together with pixel-wise annotations for 14 semantic classes. To the best of our knowledge, it is the largest and most diverse light field dataset for semantic segmentation. We further design two new semantic segmentation baselines tailored for light field and compare them with state-of-the-art RGB-, video- and RGB-D-based methods on the proposed dataset. The superior results of our baselines demonstrate the advantages of the geometric information in light field for this task. We also provide evaluations of super-resolution and depth estimation methods, showing that the proposed dataset presents new challenges and supports detailed comparisons among different methods. We expect this work to inspire new research directions and stimulate scientific progress in related fields. The complete dataset is available at https://github.com/HAWKEYE-Group/UrbanLF.
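To make the spatial/angular distinction concrete, a light field is commonly stored as a 4D (plus color) array indexed by two angular and two spatial coordinates. The sketch below is a generic illustration of this representation, not the UrbanLF data format; all dimensions and names are our own assumptions.

```python
import numpy as np

# Toy light field L(u, v, s, t, c): angular coordinates (u, v),
# spatial coordinates (s, t), color channel c.
# Sizes here are illustrative only, not those of the UrbanLF dataset.
U, V, S, T, C = 5, 5, 64, 64, 3
lf = np.random.rand(U, V, S, T, C)

# The central sub-aperture image (SAI) is the view from the middle of
# the angular grid -- the single image an ordinary RGB camera would see.
center_sai = lf[U // 2, V // 2]   # shape (S, T, C)

# The stack of all U*V views is what gives light field methods access
# to geometric (parallax) cues that a single RGB image lacks.
all_views = lf.reshape(U * V, S, T, C)
```

Semantic segmentation baselines for light field typically consume either the full view stack or the central SAI plus features derived from the other views.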
Recently, tracking-by-detection has become a popular paradigm in multiple-object tracking (MOT) for its concise pipeline. Many current works first associate detections to form track proposals and then score the proposals with hand-crafted functions to select the best. However, long-term tracking information is lost in this way due to detection failure or heavy occlusion. In this paper, the Extendable Multiple Nodes Tracking framework (EMNT) is introduced to model the association. Instead of detections, EMNT creates four basic types of nodes, namely correct, false, dummy and termination, to generically model the tracking procedure. Further, we propose a General Recurrent Tracking Unit (RTU++) to score track proposals by capturing long-term information. In addition, we present an efficient method for generating simulated tracking data to overcome the dilemma of limited available data in MOT. Experiments show that our methods achieve state-of-the-art performance on the MOT17, MOT20 and HiEve benchmarks. Meanwhile, RTU++ can be flexibly plugged into other trackers such as MHT and brings significant improvements. Additional experiments on MOTS20 and CTMC-v1 also demonstrate the generalization ability of RTU++ trained on simulated data in various scenarios.
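The four node types named in the abstract can be illustrated with a minimal data model; everything beyond the four type names (the enum, the example proposal, the field names) is our own hypothetical sketch, not the EMNT implementation.

```python
from enum import Enum, auto

# The four basic node types EMNT uses to model the tracking procedure
# (type names from the abstract; this data model is illustrative only).
class NodeType(Enum):
    CORRECT = auto()      # a detection correctly linked to this track
    FALSE = auto()        # a false-positive detection
    DUMMY = auto()        # placeholder for a missed detection / occlusion
    TERMINATION = auto()  # the track ends at this frame

# A track proposal is then a per-frame sequence of typed nodes; a learned
# scorer such as RTU++ would consume such a sequence to rank competing
# proposals, retaining long-term context even across dummy (occluded) steps.
proposal = [NodeType.CORRECT, NodeType.CORRECT, NodeType.DUMMY,
            NodeType.CORRECT, NodeType.TERMINATION]
```

Modeling occlusions explicitly as dummy nodes, rather than dropping the frame, is what lets a recurrent scorer carry information across detection gaps.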
Light field (LF) semantic segmentation is a newly emerging technology and is widely used in many smart city applications such as remote sensing, virtual reality and 3D photogrammetry. Compared with RGB images, LF images contain multi-layer contextual information and rich geometric information about real-world scenes, which are challenging to exploit fully because of the complex and highly intertwined structure of LF. In this paper, the LF Contextual Feature (LFCF) and LF Geometric Feature (LFGF) are proposed for occluded area perception and segmentation edge refinement, respectively. By exploiting all the views in the LF, LFCF provides a glimpse of occluded areas from other angular positions, beyond the superficial color information of the target view. The multi-layer information of the occluded area enhances the classification of partly occluded objects. LFGF, in turn, is extracted from Ray Epipolar-Plane Images (RayEPIs) in eight directions to embed geometric information. This solid geometric information refines object edges, especially for occlusion boundaries with similar colors. Finally, the Light Field Robust Segmentation Network (LFRSNet) is designed to integrate LFCF and LFGF. Multi-layer contextual information and geometric information are effectively incorporated through LFRSNet, which brings significant improvement for the segmentation of occluded objects and object edges. Experimental results on both real-world and synthetic datasets prove the state-of-the-art performance of our method. Compared with other methods, LFRSNet produces more accurate segmentation under occlusion, especially in edge regions.
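The RayEPIs above generalize the standard two-direction epipolar-plane images. As background, a conventional EPI is a 2D slice of the 4D light field obtained by fixing one angular and one spatial coordinate; scene depth then appears as the slope of lines in the slice. The sketch below shows only this standard horizontal/vertical slicing under our own illustrative indexing, not the paper's eight-direction RayEPI construction.

```python
import numpy as np

# Toy 4D light field L(u, v, s, t): sizes are illustrative only.
U, V, S, T = 9, 9, 32, 32
lf = np.random.rand(U, V, S, T)

def horizontal_epi(lf, v, s):
    """Fix angular row v and spatial row s -> 2D slice over (u, t).
    Lines in this slice have slopes proportional to scene disparity."""
    return lf[:, v, s, :]   # shape (U, T)

def vertical_epi(lf, u, t):
    """Fix angular column u and spatial column t -> 2D slice over (v, s)."""
    return lf[u, :, :, t]   # shape (V, S)
```

Slicing along additional (e.g. diagonal) directions through the angular grid yields more such images, which is the kind of multi-directional geometric cue the eight-direction RayEPIs provide.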