Recent deep networks have achieved state-of-the-art performance on a variety of semantic segmentation tasks. Despite such progress, these models often struggle in real-world "wild tasks" where a large difference exists between labeled training/source data and unseen test/target data. This difference is commonly referred to as the "domain gap", and it can cause significantly degraded performance that cannot be easily remedied by further increasing the representation power. Unsupervised domain adaptation (UDA) seeks to overcome this problem without target domain labels. In this paper, we propose a novel UDA framework based on an iterative self-training procedure, where the problem is formulated as latent variable loss minimization and can be solved by alternately generating pseudo-labels on target data and re-training the model with these labels. On top of self-training, we also propose a novel class-balanced self-training framework to avoid the gradual dominance of large classes in pseudo-label generation, and introduce spatial priors to refine the generated pseudo-labels. Comprehensive experiments show that the proposed methods achieve state-of-the-art semantic segmentation performance under multiple major UDA settings.
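The class-balanced pseudo-label selection described above can be illustrated with a short NumPy sketch: instead of one global confidence threshold (which lets large, easy classes dominate), each class keeps only its own most confident predictions. This is a simplification, not the paper's exact algorithm; the function name, the flat `(N, C)` probability layout, and the single `keep_frac` shared by all classes are assumptions of the sketch.

```python
import numpy as np

def class_balanced_pseudo_labels(probs, keep_frac=0.5, ignore_index=255):
    """Assign pseudo-labels from softmax outputs using a separate confidence
    threshold per class, so that large classes cannot dominate selection.

    probs: (N, C) softmax outputs for N target pixels over C classes.
    keep_frac: fraction of each class's predictions to keep.
    Returns an (N,) label array; unselected pixels get ignore_index.
    """
    preds = probs.argmax(axis=1)            # most likely class per pixel
    conf = probs.max(axis=1)                # confidence of that prediction
    labels = np.full(preds.shape, ignore_index, dtype=np.int64)
    for c in range(probs.shape[1]):
        mask = preds == c
        if not mask.any():
            continue
        # class-specific threshold: keep the top keep_frac most confident
        thresh = np.quantile(conf[mask], 1.0 - keep_frac)
        labels[mask & (conf >= thresh)] = c
    return labels
```

Pixels below their class threshold are marked with the ignore index and contribute no gradient during retraining, which is the usual convention in segmentation losses.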
Recent advances in domain adaptation show that deep self-training presents a powerful means for unsupervised domain adaptation. These methods often involve an iterative process of predicting on the target domain and then taking the confident predictions as pseudo-labels for retraining. However, since pseudo-labels can be noisy, self-training can put overconfident label belief on wrong classes, leading to deviated solutions with propagated errors. To address this problem, we propose a confidence regularized self-training (CRST) framework, formulated as regularized self-training. Our method treats pseudo-labels as continuous latent variables jointly optimized via alternating optimization. We propose two types of confidence regularization: label regularization (LR) and model regularization (MR). CRST-LR generates soft pseudo-labels, while CRST-MR encourages smoothness of the network output. Extensive experiments on image classification and semantic segmentation show that CRSTs outperform their non-regularized counterparts with state-of-the-art performance. The code and models of this work are available at https://github.com/yzou2/CRST.

[Figure 1: Illustration of the proposed confidence regularization. (a) Self-training without confidence regularization generates and retrains with hard pseudo-labels, resulting in sharp network output. (b) Label regularized self-training introduces soft pseudo-labels, therefore enabling outputs to be smooth. (c) Model regularized self-training also retrains with hard pseudo-labels, but incorporates a regularizer to directly promote output smoothness.]

More recently, self-training with networks emerged as a promising alternative for domain adaptation [4,5,25,29,49,54,69].
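As a toy illustration of the hard-versus-soft distinction above, the sketch below softens a one-hot pseudo-label with uniform label smoothing. This is only a stand-in for intuition, not CRST's actual label regularization (which optimizes the soft labels jointly with the network); the function name and the `alpha` parameter are assumptions of the sketch.

```python
import numpy as np

def soft_pseudo_label(probs, alpha=0.1):
    """Soften a hard pseudo-label with uniform label smoothing: instead of
    a one-hot retraining target, keep some probability mass on all classes.

    probs: (C,) softmax output for one pixel.
    alpha: smoothing strength; alpha=0 recovers the hard one-hot label.
    """
    n_classes = probs.shape[0]
    one_hot = np.zeros(n_classes)
    one_hot[probs.argmax()] = 1.0
    return (1.0 - alpha) * one_hot + alpha / n_classes  # still sums to 1
```

Retraining against such a target no longer rewards arbitrarily sharp outputs, which is the intuition behind the smoother predictions in panel (b) of Figure 1.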
Self-training iteratively generates a set of one-hot (or hard) pseudo-labels corresponding to large selection scores (i.e., high prediction confidence) in the target domain, and then retrains the network on target data with these pseudo-labels. Recently, [69] proposed class-balanced self-training (CBST), which formulates self-training as a unified loss minimization over pseudo-labels that can be solved in an end-to-end manner. Instead of reducing the domain gap by minimizing both the task loss and a domain adversarial loss, the self-training loss implicitly encourages cross-domain feature alignment for each class by learning from both labeled source data and pseudo-labeled target data. Early work [29] shows that the essence of deep self-training is entropy minimization: pushing the network output to be as sharp as the hard pseudo-labels. However, 100% accuracy cannot always be guaranteed for pseudo-labels. Trusting all selected pseudo-labels as "ground truth" by encoding them as one-hot vectors can therefore place overconfident belief on wrong classes.
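The entropy-minimization view can be made concrete with a small numeric check: a one-hot target has zero entropy, so fitting it with cross-entropy drives the output distribution from "soft" toward "sharp". The helper below is purely illustrative.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    p = np.clip(p, eps, 1.0)  # clip to avoid log(0)
    return float(-(p * np.log(p)).sum())

# A hard pseudo-label is the zero-entropy extreme; retraining on it
# rewards ever sharper (lower-entropy) network outputs.
soft_output = np.array([0.4, 0.35, 0.25])
sharp_output = np.array([0.98, 0.01, 0.01])
one_hot = np.array([1.0, 0.0, 0.0])
```

Comparing the three vectors shows the monotone trend: the sharp output has far lower entropy than the soft one, and the one-hot target sits at (numerically) zero.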
The synthesis of a new category of spatial filters that produce sharp output correlation peaks with controlled peak values is considered. The sharp nature of the correlation peak is the major feature emphasized, since it facilitates target detection. Since these filters minimize the average correlation plane energy as the first step in filter synthesis, we refer to them as minimum average correlation energy filters. Experimental laboratory results from optical implementation of the filters are also presented and discussed.
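Under the standard formulation, the minimum average correlation energy (MACE) filter has a closed form in the frequency domain, h = D⁻¹X(XᴴD⁻¹X)⁻¹u, where the columns of X are the training-image DFTs, D is the diagonal average power spectrum, and u holds the desired (controlled) correlation-peak values. A minimal 1-D NumPy sketch follows; the function name and the flattened array layout are assumptions made for illustration.

```python
import numpy as np

def mace_filter(images, u=None):
    """Frequency-domain MACE filter: h = D^{-1} X (X^H D^{-1} X)^{-1} u.

    images: (n, d) array of n training images, each flattened to length d.
    u: (n,) desired correlation-peak values (default: all ones).
    Returns the filter in the frequency domain, shape (d,).
    """
    n, d = images.shape
    X = np.fft.fft(images, axis=1).T          # (d, n): columns are image DFTs
    D = (np.abs(X) ** 2).mean(axis=1)         # diagonal of the avg power spectrum
    D = np.maximum(D, 1e-12)                  # guard against division by zero
    Dinv_X = X / D[:, None]                   # D^{-1} X without forming D
    if u is None:
        u = np.ones(n)
    A = X.conj().T @ Dinv_X                   # X^H D^{-1} X, an (n, n) matrix
    return Dinv_X @ np.linalg.solve(A, u)     # filter satisfies X^H h = u
```

By construction the filter meets the hard peak constraints XᴴH = u exactly while minimizing the average correlation-plane energy, which is what yields the sharp peaks the abstract emphasizes.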
While deep learning methods have achieved state-of-the-art performance in many challenging inverse problems like image inpainting and super-resolution, they invariably involve problem-specific training of the networks. Under this approach, different problems require different networks. In scenarios where we need to solve a wide variety of problems, e.g., on a mobile camera, it is inefficient and costly to use these specially-trained networks. On the other hand, traditional methods using signal priors can be applied to all linear inverse problems but often perform worse on challenging tasks. In this work, we provide a middle ground between the two kinds of methods: we propose a general framework to train a single deep neural network that solves arbitrary linear inverse problems. The proposed network acts as a proximal operator for an optimization algorithm and projects non-image signals onto the set of natural images defined by the decision boundary of a classifier. In our experiments, the proposed framework demonstrates superior performance over traditional methods using a wavelet sparsity prior and achieves performance comparable to that of specially-trained networks on tasks including compressive sensing and pixel-wise inpainting.
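The idea of reusing one network as a proximal/projection operator inside an optimization loop can be sketched as projected gradient descent on ||Ax − y||². Here `project` is a placeholder for the trained network; the identity projector in the test, the step size, and the function name are all assumptions of the sketch (in particular, `lr` must be below 2 divided by the largest eigenvalue of AᵀA for the gradient step to converge).

```python
import numpy as np

def solve_inverse_problem(A, y, project, steps=500, lr=0.2):
    """Projected gradient descent for the linear inverse problem y = A @ x.

    `project` plays the role of the paper's trained network: a map from an
    arbitrary signal to a nearby point in the set of natural images. Any
    callable with that contract can be plugged in, so the same solver
    handles any measurement matrix A without retraining.
    """
    x = A.T @ y                                 # simple initialization
    for _ in range(steps):
        x = x - lr * (A.T @ (A @ x - y))        # gradient step on ||Ax - y||^2
        x = project(x)                          # projection / proximal step
    return x
```

With the identity as projector and a well-conditioned A, the loop reduces to plain gradient descent and recovers the least-squares solution; swapping in a different A (inpainting mask, compressive-sensing matrix, blur operator) changes only the data term, which is precisely the appeal of the single-network approach.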
A mathematical analysis of the distortion tolerance in correlation filters is presented. A good measure for distortion performance is shown to be a generalization of the minimum average correlation energy criterion. To optimize the filter's performance, we remove the usual hard constraints on the outputs in the synthetic discriminant function formulation. The resulting filters exhibit superior distortion tolerance while retaining the attractive features of their predecessors such as the minimum average correlation energy filter and the minimum variance synthetic discriminant function filter. The proposed theory also unifies several existing approaches and examines the relationship between different formulations. The proposed filter design algorithm requires only simple statistical parameters and the inversion of diagonal matrices, which makes it attractive from a computational standpoint. Several properties of these filters are discussed with illustrative examples.
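A well-known instance of this unconstrained design is the filter h = D⁻¹m, where m is the mean training-image spectrum and D is the diagonal average power spectrum: once the hard output constraints are dropped, maximizing average peak height against average correlation energy needs only element-wise statistics and the inversion of a diagonal matrix, as the abstract notes. A 1-D NumPy sketch (function name and array layout are assumptions for illustration):

```python
import numpy as np

def unconstrained_filter(images):
    """Unconstrained correlation filter sketch: h = D^{-1} m.

    m is the mean training-image spectrum and D the (diagonal) average
    power spectrum, so "inverting" D is just an element-wise division.
    images: (n, d) array of n flattened training images of length d.
    """
    X = np.fft.fft(images, axis=1)            # (n, d) image DFTs
    m = X.mean(axis=0)                        # mean spectrum
    D = (np.abs(X) ** 2).mean(axis=0)         # average power spectrum
    return m / np.maximum(D, 1e-12)           # diagonal "inversion" only
```

For a single training image this reduces to a matched filter whitened by its own power spectrum, whose correlation output is a delta at zero shift, i.e., the sharpest possible peak.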