Mikhail Romanov scite author profile

In a multi-task learning (MTL) setting, a single model is trained to tackle a diverse set of tasks jointly. Despite rapid progress in the field, MTL remains challenging due to optimization issues such as conflicting and dominating gradients. In this work, we propose using a condition number of a linear system of gradients as a stability criterion of an MTL optimization. We theoretically demonstrate that a condition number reflects the aforementioned optimization issues. Accordingly, we present Aligned-MTL, a novel MTL optimization approach based on the proposed criterion, that eliminates instability in the training process by aligning the orthogonal components of the linear system of gradients. While many recent MTL approaches guarantee convergence to a minimum, task trade-offs cannot be specified in advance. In contrast, Aligned-MTL provably converges to an optimal point with pre-defined task-specific weights, which provides more control over the optimization result. Through experiments, we show that the proposed approach consistently improves performance on a diverse set of MTL benchmarks, including semantic and instance segmentation, depth estimation, surface normal estimation, and reinforcement learning. The source code is publicly available at https://github.com/SamsungLabs/MTL.
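The stability criterion described in the abstract above can be illustrated with a minimal sketch for the two-task case. This is not the authors' implementation (see the linked repository for that); the function name `gradient_condition_number` and the toy gradient vectors are hypothetical, chosen only for illustration. The sketch assumes the "linear system of gradients" is the matrix whose columns are the per-task gradients, and computes its condition number from the eigenvalues of the 2x2 Gram matrix.

```python
import math


def gradient_condition_number(g1, g2):
    """Condition number of the two-column gradient matrix G = [g1, g2].

    Hypothetical helper for illustration: uses the closed-form eigenvalues
    of the 2x2 Gram matrix G^T G = [[a, b], [b, c]].
    """
    a = sum(x * x for x in g1)              # squared norm of task-1 gradient
    c = sum(y * y for y in g2)              # squared norm of task-2 gradient
    b = sum(x * y for x, y in zip(g1, g2))  # inner product of the gradients

    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    lam_max = mean + radius
    lam_min = mean - radius

    # Collinear gradients give a singular Gram matrix.
    if lam_min <= 0.0:
        return math.inf
    # Singular values of G are the square roots of the Gram eigenvalues.
    return math.sqrt(lam_max / lam_min)


if __name__ == "__main__":
    # Orthogonal gradients of equal norm: well-conditioned.
    print(gradient_condition_number([1.0, 0.0], [0.0, 1.0]))
    # One gradient dominates the other in magnitude.
    print(gradient_condition_number([10.0, 0.0], [0.0, 1.0]))
    # Nearly opposite (conflicting) gradients.
    print(gradient_condition_number([1.0, 0.0], [-0.9, 0.1]))
```

In this toy setting, both a dominating gradient and a pair of nearly opposite gradients yield a condition number well above 1, while orthogonal gradients of equal norm yield exactly 1, which is consistent with the abstract's claim that the condition number reflects both issues.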

Double Refinement Network for Efficient Monocular Depth Estimation

Durasov, Romanov, Bubnova, et al. 2019

Monocular depth estimation is the task of obtaining a measure of distance for each pixel using a single image. It is an important problem in computer vision and is usually solved using neural networks. Though recent works in this area have shown significant improvement in accuracy, the state-of-the-art methods tend to require massive amounts of memory and time to process an image. The main purpose of this work is to improve the performance of the latest solutions with no decrease in accuracy. To this end, we introduce the Double Refinement Network architecture. The proposed method achieves state-of-the-art results on the standard benchmark RGB-D dataset NYU Depth v2, while its frames per second rate is significantly higher (up to 18 times speedup per image at batch size 1) and the RAM usage per image is lower.

Double Refinement Network for Room Layout Estimation

Kruzhilov, Romanov, Konushin. 2019. Preprint

Room layout estimation is the task of segmenting a cluttered room image into floor, walls, and ceiling. We applied the Double Refinement Network, which has proved efficient for depth estimation, to generate heat maps for room key points and edges. Our method is the first that does not use an encoder-decoder architecture for room layout estimation. ResNet50 was used as the backbone of the network instead of the VGG16 commonly used for this task, which makes the network more compact and faster. We designed a dedicated layout score function and a layout ranking algorithm for the key point and edge outputs. Our method achieved the lowest pixel and corner errors on the LSUN dataset. The input image resolution is 224×224.

Learning High-Resolution Domain-Specific Representations with a GAN Generator

Galeev, Sofiiuk, Rukhovich, et al. 2021

Mikhail Romanov

Simultaneous tomographic reconstruction and segmentation with class priors

Decoder Modulation for Indoor Depth Completion

Double Refinement Network for Efficient Monocular Depth Estimation

Double Refinement Network for Room Layout Estimation

Learning High-Resolution Domain-Specific Representations with a GAN Generator
