Ruslan Rakhimov scite author profile

The video generation task can be formulated as predicting future video frames given some past frames. Recent generative models for video have high computational requirements: some require up to 512 Tensor Processing Units for parallel training. In this work, we address this problem by modeling the dynamics in a latent space. After transforming frames into the latent space, our model predicts the latent representations of the next frames in an autoregressive manner. We demonstrate the performance of our approach on the BAIR Robot Pushing and Kinetics-600 datasets. The approach tends to reduce the training requirements to 8 Graphics Processing Units while maintaining comparable generation quality.

Latent Video Transformer

Rakhimov

Volkhonskiy

Artemov

et al. 2021

NPBG++: Accelerating Neural Point-Based Graphics

Rakhimov

Ardelean

Lempitsky

et al. 2022

DEF: Deep Estimation of Sharp Geometric Features in 3D Shapes

Matveev

Rakhimov

Artemov

et al. 2020

Preprint

Def

et al. 2022

We propose Deep Estimators of Features (DEFs), a learning-based framework for predicting sharp geometric features in sampled 3D shapes. Unlike existing data-driven methods, which reduce this problem to feature classification, we propose to regress a scalar field representing the distance from point samples to the closest feature line on local patches. Our approach is the first that scales to massive point clouds by fusing distance-to-feature estimates obtained on individual patches. We extensively evaluate our approach against related state-of-the-art methods on newly proposed synthetic and real-world 3D CAD model benchmarks. Our approach not only outperforms these methods (with improvements in Recall and False Positive Rate), but also generalizes to real-world scans after training on synthetic data and fine-tuning on a small dataset of scanned data. We demonstrate a downstream application in which we reconstruct an explicit representation of straight and curved sharp feature lines from range scan data. We make code, pre-trained models, and our training and evaluation datasets available at https://github.com/artonson/def.
