This paper proposes an end-to-end trainable network, SegFlow, for simultaneously predicting pixel-wise object segmentation and optical flow in videos. SegFlow has two branches through which useful information from object segmentation and optical flow is propagated bidirectionally in a unified framework. The segmentation branch is based on a fully convolutional network, which has proven effective for image segmentation, and the optical flow branch builds on the FlowNet model. The unified framework is trained iteratively offline to learn a generic notion, and fine-tuned online for specific objects. Extensive experiments on both video object segmentation and optical flow datasets demonstrate that introducing optical flow improves segmentation performance and vice versa, compared with state-of-the-art algorithms.
Online video object segmentation is a challenging task because it requires processing the image sequence both quickly and accurately. To segment a target object throughout a video, numerous CNN-based methods rely on heavy fine-tuning on the object mask in the first frame, which is time-consuming for online applications. In this paper, we propose a fast and accurate video object segmentation algorithm that can start segmenting as soon as it receives the images. We first use a part-based tracking method to deal with challenging factors such as large deformation, occlusion, and cluttered background. Based on the tracked bounding boxes of parts, we construct a region-of-interest segmentation network to generate part masks. Finally, a similarity-based scoring function is adopted to refine these object parts by comparing them to the visual information in the first frame. Our method performs favorably against state-of-the-art algorithms in accuracy on the DAVIS benchmark dataset, while achieving much faster runtime performance.
The cloud radio access network (Cloud-RAN) has recently been proposed as a cost-effective and energy-efficient technique for 5G wireless networks. By moving the signal processing functionality to a single baseband unit (BBU) pool, Cloud-RAN enables centralized signal processing and resource allocation, and thereby promises improved energy efficiency via effective network adaptation and interference management. In this paper, we propose a holistic sparse optimization framework to design green Cloud-RAN by taking into consideration the power consumption of the fronthaul links, multicast services, and user admission control. Specifically, we first identify the sparsity structures in the solutions of both the network power minimization and user admission control problems, which call for adaptive remote radio head (RRH) selection and user admission. However, finding the optimal sparsity structures turns out to be NP-hard, with the coupled challenges of the ℓ0-norm based objective functions and the nonconvex quadratic QoS constraints due to multicast beamforming. In contrast to previous work on convex but non-smooth sparsity-inducing approaches, e.g., the group sparse beamforming algorithm based on the mixed ℓ1/ℓ2-norm relaxation [1], we adopt the nonconvex but smoothed ℓp-minimization (0 < p ≤ 1) approach to promote sparsity in the multicast setting, thereby enabling efficient algorithm design based on the principle of the majorization-minimization (MM) algorithm and the semidefinite relaxation (SDR) technique. In particular, an iterative reweighted-ℓ2 algorithm is developed, which converges to a Karush-Kuhn-Tucker (KKT) point of the relaxed smoothed ℓp-minimization problem obtained from the SDR technique. We illustrate the effectiveness of the proposed algorithms with extensive simulations for network power minimization and user admission control in multicast Cloud-RAN.
Index Terms: 5G networks, green communications, Cloud-RAN, multicast beamforming, sparse optimization, semidefinite relaxation, smoothed ℓp-minimization, and user admission control.
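The reweighted-ℓ2 idea mentioned in the abstract above can be illustrated on a much simpler problem than multicast beamforming. The following is a minimal sketch, not the paper's algorithm: it assumes a plain linear constraint `A x = b` in place of the QoS constraints and SDR step, and the names `smoothed_lp`, `reweighted_l2`, `eps`, and `iters` are hypothetical. Each iteration majorizes the smoothed ℓp objective by a weighted ℓ2 objective (the MM step) and solves that weighted problem in closed form.

```python
import numpy as np

def smoothed_lp(x, p, eps):
    """Smoothed l_p surrogate: sum_i (x_i^2 + eps^2)^(p/2)."""
    return np.sum((x**2 + eps**2) ** (p / 2))

def reweighted_l2(A, b, p=0.5, eps=1e-3, iters=30):
    """Iteratively reweighted l2 for: min smoothed_lp(x) s.t. A x = b.

    Assumes A has full row rank. Returns the final iterate and the
    objective value after every iteration.
    """
    # Start from the minimum l2-norm feasible point.
    x = A.T @ np.linalg.solve(A @ A.T, b)
    history = [smoothed_lp(x, p, eps)]
    for _ in range(iters):
        # MM weights: slope of t -> (t + eps^2)^(p/2) at t = x_i^2.
        w = (p / 2) * (x**2 + eps**2) ** (p / 2 - 1)
        d = 1.0 / w
        # Closed-form minimizer of sum_i w_i x_i^2 subject to A x = b:
        # x = D A^T (A D A^T)^{-1} b with D = diag(1 / w).
        AD = A * d
        x = d * (A.T @ np.linalg.solve(AD @ A.T, b))
        history.append(smoothed_lp(x, p, eps))
    return x, history

# Illustrative data (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
A_demo = rng.standard_normal((5, 12))
x_sparse = np.zeros(12)
x_sparse[[2, 7]] = [1.5, -2.0]
x_hat, objective = reweighted_l2(A_demo, A_demo @ x_sparse)
```

Because each weighted ℓ2 problem majorizes the smoothed objective and touches it at the current iterate, the recorded objective values should not increase from one iteration to the next, which is the property the MM principle provides.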
Fine-tuning pre-trained deep networks is a practical way of benefiting from the representation learned on a large database while having relatively few examples to train a model. This adjustment is now routinely performed to benefit from the latest improvements of convolutional neural networks trained on large databases. Fine-tuning requires some form of regularization, which is typically implemented by weight decay that drives the network parameters towards zero. This choice conflicts with the motivation for fine-tuning, as starting from a pre-trained solution aims at taking advantage of the previously acquired knowledge. Hence, regularizers promoting an explicit inductive bias towards the pre-trained model have recently been proposed. This paper demonstrates the versatility of this type of regularizer across transfer learning scenarios. We replicated experiments on three state-of-the-art approaches in image classification, image segmentation, and video analysis to compare the relative merits of regularizers. These tests show systematic improvements compared to weight decay. Our experimental protocol highlights the versatility of a regularizer that is easy to implement and to operate, which we recommend as the new baseline for future approaches to transfer learning relying on fine-tuning.
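The contrast between weight decay and a regularizer biased towards the pre-trained model can be sketched on a toy linear model. The abstract does not say which regularizer it recommends, so the squared ℓ2 distance to the pre-trained weights used here is an assumption, and the names `fine_tune`, `anchor`, and `alpha` are hypothetical. The only difference between the two variants is the point the penalty pulls towards: zero for weight decay, the pre-trained weights otherwise.

```python
import numpy as np

def weight_decay_penalty(w, alpha):
    """Standard weight decay: pulls parameters towards zero."""
    return 0.5 * alpha * np.sum(w**2)

def starting_point_penalty(w, w_pretrained, alpha):
    """Assumed variant: pulls parameters towards the pre-trained weights."""
    return 0.5 * alpha * np.sum((w - w_pretrained) ** 2)

def fine_tune(X, y, w_init, alpha, anchor, lr=0.05, steps=500):
    """Gradient descent on mean squared error plus alpha/2 * ||w - anchor||^2.

    anchor = zeros gives weight decay; anchor = pre-trained weights gives
    the penalty that keeps the solution near the starting point.
    """
    w = np.asarray(w_init, dtype=float).copy()
    n = len(y)
    for _ in range(steps):
        grad_loss = X.T @ (X @ w - y) / n      # gradient of the data term
        grad_penalty = alpha * (w - anchor)    # gradient of the penalty
        w -= lr * (grad_loss + grad_penalty)
    return w

# Toy target task whose true weights are close to the pre-trained ones
# (illustrative data, not from the paper).
rng = np.random.default_rng(0)
w_pre = np.array([3.0, -2.0, 1.0, 2.0, -3.0])
X = rng.standard_normal((50, 5))
y = X @ (w_pre + 0.1 * rng.standard_normal(5))

w_decay = fine_tune(X, y, w_pre, alpha=1.0, anchor=np.zeros(5))
w_anchored = fine_tune(X, y, w_pre, alpha=1.0, anchor=w_pre)
```

In this toy setting, comparing `np.linalg.norm(w_decay - w_pre)` with `np.linalg.norm(w_anchored - w_pre)` shows how far each penalty lets the fine-tuned weights drift from the pre-trained solution.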