Xinbang Zhang scite author profile

Neural Architecture Search (NAS) has recently attracted great interest in both academia and industry, but it remains challenging because of its huge, non-continuous search space. Instead of applying evolutionary algorithms or reinforcement learning as in previous work, this paper proposes a Direct Sparse Optimization NAS (DSO-NAS) method. In DSO-NAS, we provide a novel model pruning view of the NAS problem. Specifically, we start from a completely connected block and then introduce scaling factors to scale the information flow between operations. Next, we impose sparse regularizations to prune useless connections in the architecture. Lastly, we derive an efficient and theoretically sound optimization method to solve it. Our method is both differentiable and efficient, and can therefore be applied directly to large datasets like ImageNet. In particular, on the CIFAR-10 dataset, DSO-NAS achieves an average test error of 2.84%, while on the ImageNet dataset DSO-NAS achieves 25.4% test error under 600M FLOPs with 8 GPUs in 18 hours.

* The work was done when Xinbang Zhang was an intern at TuSimple.
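The pruning view in the DSO-NAS abstract above (scaling factors on connections, then a sparse regularizer that drives useless ones to zero) can be illustrated with a small sketch. The abstract does not name the regularizer or the optimizer, so this assumes an L1 penalty handled by a plain proximal-gradient step (soft-thresholding); the edge names, gradients, learning rate, and penalty weight are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value|: shrink toward zero, clamp small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def proximal_step(scales, grads, lr, l1_weight):
    """One proximal-gradient update on a dict of per-edge scaling factors (assumed L1 penalty)."""
    return {
        edge: soft_threshold(scales[edge] - lr * grads[edge], lr * l1_weight)
        for edge in scales
    }


def surviving_edges(scales):
    """Edges with a nonzero scaling factor are kept; edges at exactly zero are pruned."""
    return sorted(edge for edge, s in scales.items() if s != 0.0)


# Hypothetical toy block: three connections between operations, made-up gradients.
scales = {("op1", "op2"): 0.9, ("op1", "op3"): 0.05, ("op2", "op3"): -0.4}
grads = {("op1", "op2"): 0.1, ("op1", "op3"): 0.2, ("op2", "op3"): -0.1}

updated = proximal_step(scales, grads, lr=0.1, l1_weight=0.5)
kept = surviving_edges(updated)
```

In this toy run the weak connection `("op1", "op3")` falls inside the threshold and is set exactly to zero, which is what lets a sparse penalty act as a pruning rule rather than just shrinking every factor.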

EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search

Fang

Chen

Zhang

et al. 2021

Sci. China Inf. Sci.

EAT-NAS: Elastic Architecture Transfer for Accelerating Large-scale Neural Architecture Search

Fang

Chen

Zhang

et al. 2019

Preprint

Neural architecture search (NAS) methods have been proposed to release human experts from tedious architecture engineering. However, most current methods are constrained to small-scale search because of limited computational resources. Meanwhile, directly applying architectures searched on small datasets to large datasets often offers no performance guarantee. This limitation impedes the wide use of NAS on large-scale tasks. To overcome this obstacle, we propose an elastic architecture transfer mechanism for accelerating large-scale neural architecture search (EAT-NAS). In our implementations, architectures are first searched on a small dataset, e.g., CIFAR-10. The best one is chosen as the basic architecture. The search process on the large dataset, e.g., ImageNet, is initialized with the basic architecture as the seed. The large-scale search process is accelerated with the help of the basic architecture. What we propose is not only a NAS method but also a mechanism for architecture-level transfer. In our experiments, we obtain two final models, EATNet-A and EATNet-B, that achieve competitive accuracies of 74.7% and 74.2% on ImageNet, respectively, and that also surpass the models searched from scratch on ImageNet under the same settings. As for computational cost, EAT-NAS takes less than 5 days on 8 TITAN X GPUs, which is significantly less than the computational consumption of state-of-the-art large-scale NAS methods.
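The seeding step described in the EAT-NAS abstract above (start the large-dataset search from the basic architecture found on the small dataset) might look roughly like the following. The abstract does not say how the seed is turned into an initial set of candidates, so this sketch assumes a simple mutation-based population; the search space, the `mutate` rule, and all names are hypothetical.

```python
import random


def mutate(arch, choices, rng, rate=0.3):
    """Copy arch, resampling each field from its allowed choices with probability rate."""
    return {
        key: (rng.choice(choices[key]) if rng.random() < rate else value)
        for key, value in arch.items()
    }


def seeded_population(seed_arch, choices, size, rng):
    """Initial population for the large-scale search: the seed plus mutated copies of it."""
    return [dict(seed_arch)] + [mutate(seed_arch, choices, rng) for _ in range(size - 1)]


# Hypothetical search space and a hypothetical basic architecture from the small-dataset search.
choices = {"kernel": [3, 5, 7], "width": [16, 32, 64], "depth": [2, 3, 4]}
basic_arch = {"kernel": 3, "width": 32, "depth": 3}

population = seeded_population(basic_arch, choices, size=8, rng=random.Random(0))
```

Every candidate stays close to the seed, so the second search begins in a region that already performed well on the small dataset instead of sampling the space from scratch.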

Multi-modal spatio-temporal meteorological forecasting with deep neural network

Zhang

Jin

et al. 2022

ISPRS Journal of Photogrammetry and Remote Sensing

Learning from the Target: Dual Prototype Network for Few Shot Semantic Segmentation

Mao

Zhang

Wang

et al. 2022

AAAI

Due to the scarcity of annotated samples, the diversity between the support set and the query set becomes the main obstacle for few-shot semantic segmentation. Most existing prototype-based approaches only exploit the prototype from the support feature and ignore the information from the query sample, failing to remove this obstacle. In this paper, we propose a dual prototype network (DPNet) to tackle few-shot semantic segmentation from a new perspective. Along with the prototype extracted from the support set, we propose to build a pseudo-prototype based on foreground features in the query image. To achieve this goal, a cycle comparison module is developed to select reliable foreground features and generate the pseudo-prototype from them. Then, a prototype interaction module is used to integrate the information of the prototype and the pseudo-prototype based on their underlying correlation. Finally, a multi-scale fusion module is introduced to capture contextual information during the dense comparison between the prototype (pseudo-prototype) and the query feature. Extensive experiments conducted on two benchmarks demonstrate that our method exceeds previous state-of-the-art methods by a sizable margin, verifying the effectiveness of the proposed method.
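As background for the DPNet abstract above, the sketch below shows only the baseline idea it builds on: a prototype taken from support features, then a dense comparison against query features. It does not implement the cycle comparison, prototype interaction, or multi-scale fusion modules. Masked average pooling, cosine similarity, the 0.5 threshold, and the toy feature vectors are assumptions chosen for illustration, not details stated in the abstract.

```python
import math


def masked_average(features, mask):
    """Average the feature vectors at positions where the foreground mask is 1."""
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no positions")
    dim = len(selected[0])
    return [sum(f[i] for f in selected) / len(selected) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical 2-D features at three support positions and two query positions.
support_features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
support_mask = [1, 1, 0]  # first two positions are foreground
query_features = [[1.0, 0.1], [0.1, 1.0]]

prototype = masked_average(support_features, support_mask)
scores = [cosine(q, prototype) for q in query_features]
predicted = [1 if s > 0.5 else 0 for s in scores]  # threshold is an arbitrary choice
```

A pseudo-prototype in the sense of the abstract would be built the same way from query positions judged to be reliable foreground, which is the part this sketch leaves out.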

AutoMF: Spatio-temporal Architecture Search for The Meteorological Forecasting Task

Zhang

Jin

Xiang

et al. 2022

Xinbang Zhang

DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling

You Only Search Once: Single Shot Neural Architecture Search via Direct Sparse Optimization

EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search

EAT-NAS: Elastic Architecture Transfer for Accelerating Large-scale Neural Architecture Search

Multi-modal spatio-temporal meteorological forecasting with deep neural network

Learning from the Target: Dual Prototype Network for Few Shot Semantic Segmentation

AutoMF: Spatio-temporal Architecture Search for The Meteorological Forecasting Task
