We address cross-species 3D face morphing (i.e., 3D face morphing from human to animal), a novel problem with promising applications in social media and the movie industry. Simultaneously preserving the target's structural information and the source's fine-grained facial details remains challenging. To this end, we propose an Alignment-aware 3D Face Morphing (AFM) framework, which builds semantic-adaptive correspondence between source and target faces across species via an alignment-aware controller mesh (Explicit Controller, EC) with explicit source/target mesh binding. Based on the EC, we introduce Controller-Based Mapping (CBM), which establishes semantic consistency between source and target faces according to the semantic importance of different face regions. Additionally, an inference-stage coarse-to-fine strategy is exploited to produce fine-grained meshes with rich facial details from rough meshes. Extensive experimental results on multiple human and animal subjects demonstrate that our method produces high-quality deformation results.
Dynamic convolution achieves significant performance gains at low computational cost, thanks to its powerful representation capability given a limited number of filters/layers. However, state-of-the-art dynamic convolution operators are sensitive to input noise (e.g., Gaussian noise, shot noise, etc.) and lack sufficient spatial contextual information during filter generation. To alleviate this inherent weakness, we propose a lightweight operator with a heterogeneous (i.e., static and dynamic) structure, named Bi-volution. On the one hand, Bi-volution is designed as a dual-branch structure to fully leverage the complementary properties of static/dynamic convolution, which endows it with greater robustness and higher performance. On the other hand, a Spatial Augmented Kernel Generation module is proposed to improve dynamic convolution, enabling the learning of spatial context information with negligible additional computational cost. Extensive experiments illustrate that ResNet-50 equipped with Bi-volution achieves a highly competitive performance boost (+2.8% top-1 accuracy on ImageNet classification, +2.4% box AP and +2.2% mask AP on COCO detection and instance segmentation) while maintaining extremely low FLOPs (i.e., ResNet-50@2.7 GFLOPs). Furthermore, Bi-volution shows better robustness than dynamic convolution against various noises and input corruptions. Our code is available at https://github.com/neuralchen/Bivolution.
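The dual-branch idea described above can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation (see the linked repository for that): a fixed (static) kernel is applied alongside a kernel generated from the input's global statistics (a crude stand-in for the Spatial Augmented Kernel Generation module), and the two branch outputs are fused by summation. All names here (`Bivolution2D`, `gen_w`) are hypothetical.

```python
import numpy as np

def conv2d(x, k):
    """Valid 2D cross-correlation of a single-channel map x with kernel k."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

class Bivolution2D:
    """Toy dual-branch operator: a fixed (static) kernel plus an
    input-conditioned (dynamic) kernel generated from global context."""
    def __init__(self, ksize=3, seed=0):
        rng = np.random.default_rng(seed)
        self.static_kernel = rng.standard_normal((ksize, ksize)) * 0.1
        # Tiny "kernel generator": maps a global descriptor to kernel weights.
        self.gen_w = rng.standard_normal((ksize * ksize, 1)) * 0.1
        self.ksize = ksize

    def __call__(self, x):
        # Static branch: content-agnostic filtering, robust to input noise.
        static_out = conv2d(x, self.static_kernel)
        # Dynamic branch: the kernel depends on the input's global mean
        # (a stand-in for a learned, spatially augmented generator).
        descriptor = np.array([[x.mean()]])
        dyn_kernel = (self.gen_w @ descriptor).reshape(self.ksize, self.ksize)
        dynamic_out = conv2d(x, dyn_kernel)
        # Fuse the complementary branches by summation.
        return static_out + dynamic_out

op = Bivolution2D()
y = op(np.ones((8, 8)))
print(y.shape)  # (6, 6)
```

In the paper's setting both branches are learned end to end and the dynamic generator also uses spatial context rather than a single global scalar; the sketch only shows why the static branch gives the operator an input-independent fallback that pure dynamic convolution lacks.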
In this paper, we give a new definition of sample complexity and develop a theoretical analysis that bridges the gap between sample complexity and model capacity. In contrast to previous works that study only toy samples, we conduct our analysis on a more general data space and establish a qualitative relationship between sample complexity and the model capacity required to achieve comparable performance. Besides, we introduce a simple indicator to evaluate sample complexity based on continuous mappings. Moreover, we further analyze the relationship between sample complexity and data distribution, which paves the way to understanding present-day representation learning. Extensive experiments on several datasets demonstrate the effectiveness of our evaluation method.
Purpose: The purpose of the study was to investigate changes in choroidal blood perfusion across different layers and quadrants, and their possibly related factors, after a 1 h visual task with an augmented reality (AR) device in two-dimensional (2D) and three-dimensional (3D) mode, respectively.
Methods: Thirty healthy subjects aged 22-37 years watched the same video source in 2D and 3D mode separately using AR glasses for 1 h, with a one-week interval. Swept-source optical coherence tomography angiography (SS-OCTA) was performed before and immediately after watching to acquire choroidal thickness (ChT), the three-dimensional choroidal vascularity index (CVI) of large- and middle-sized choroidal vessels, and choriocapillaris flow voids (FV%) in the macular and peripapillary areas. Near point of accommodation (NPA) and accommodative facility (AF) were examined to evaluate accommodative ability. Pupil diameters measured with an infrared automated pupillometer under scotopic, mesopic and photopic conditions were also obtained.
Results: Compared with the pre-visual task, the subfoveal CVI decreased from 0.406 ± 0.097 to 0.360 ± 0.102 after 2D watching (p < 0.001) and to 0.368 ± 0.102 after 3D watching (p = 0.002). Pupil sizes under different illuminance conditions became smaller after both 2D and 3D watching (all p < 0.001). AF increased after both 2D and 3D watching (both p < 0.05). NPA receded after 3D watching (p = 0.017), while only a non-significant tendency was observed after 2D watching.
Conclusion: A reduction in subfoveal choroidal blood flow accompanied by pupil constriction was observed immediately after a 1 h visual task using AR glasses in both 2D and 3D mode. Accommodative facility improved after both 2D and 3D watching with AR glasses, whereas a decrease in maximum accommodation power was found only in 3D mode.