Recovering a large matrix from a small subset of its entries is a challenging problem arising in many real applications, such as image inpainting and recommender systems. Many existing approaches formulate this problem as a general low-rank matrix approximation problem. Since the rank operator is nonconvex and discontinuous, most of the recent theoretical studies use the nuclear norm as a convex relaxation. One major limitation of the existing approaches based on nuclear norm minimization is that all the singular values are simultaneously minimized, and thus the rank may not be well approximated in practice. In this paper, we propose to achieve a better approximation to the rank of matrix by truncated nuclear norm, which is given by the nuclear norm subtracted by the sum of the largest few singular values. In addition, we develop a novel matrix completion algorithm by minimizing the Truncated Nuclear Norm. We further develop three efficient iterative procedures, TNNR-ADMM, TNNR-APGL, and TNNR-ADMMAP, to solve the optimization problem. TNNR-ADMM utilizes the alternating direction method of multipliers (ADMM), while TNNR-AGPL applies the accelerated proximal gradient line search method (APGL) for the final optimization. For TNNR-ADMMAP, we make use of an adaptive penalty according to a novel update rule for ADMM to achieve a faster convergence rate. Our empirical study shows encouraging results of the proposed algorithms in comparison to the state-of-the-art matrix completion algorithms on both synthetic and real visual datasets.
Robotic-assisted tracheal intubation requires the robot to distinguish anatomical features like an experienced physician using deep-learning techniques. However, real datasets of oropharyngeal organs are limited due to patient privacy issues, making it challenging to train deep-learning models for accurate image segmentation. We hereby consider generating a new data modality through a virtual environment to assist the training process. Specifically, this work introduces a virtual dataset generated by the Simulation Open Framework Architecture (SOFA) framework to overcome the limited availability of actual endoscopic images. We also propose a domain adaptive Sim-to-Real method for oropharyngeal organ image segmentation, which employs an image blending strategy called IoU-Ranking Blend (IRB) and style-transfer techniques to address discrepancies between datasets. Experimental results demonstrate the superior performance of the proposed approach with domain adaptive models, improving segmentation accuracy and training stability. In the practical application, the trained segmentation model holds great promise for robotassisted intubation surgery and intelligent surgical navigation.
Can our video understanding systems perceive objects when a heavy occlusion exists in a scene? To answer this question, we collect a large-scale dataset called OVIS for occluded video instance segmentation, that is, to simultaneously detect, segment, and track instances in occluded scenes. OVIS consists of 296k high-quality instance masks from 25 semantic categories, where object occlusions usually occur. While our human vision systems can understand those occluded instances by contextual reasoning and association, our experiments suggest that current video understanding systems cannot. On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16.3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario. We also present a simple plug-and-play module that performs temporal feature calibration to complement missing object cues caused by occlusion. Built upon MaskTrack R-CNN and SipMask, we obtain a remarkable AP improvement on the OVIS dataset. The OVIS dataset and project code are available at http://songbai.site/ovis.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.