It is commonly believed that training deep transformers from scratch requires large datasets. Consequently, for small datasets, practitioners usually add shallow, simple layers on top of pre-trained models during fine-tuning. This work shows that this need not always be the case: with proper initialization and optimization, the benefits of very deep transformers can carry over to challenging tasks with small datasets, including Text-to-SQL semantic parsing and logical reading comprehension. In particular, we successfully train 48 transformer layers, comprising 24 fine-tuned layers from pre-trained RoBERTa and 24 relation-aware layers trained from scratch. With fewer training steps and no task-specific pre-training, we obtain state-of-the-art performance on the challenging cross-domain Text-to-SQL parsing benchmark Spider. We achieve this by deriving a novel Data-dependent Transformer Fixed-update initialization scheme (DT-Fixup), inspired by the prior T-Fixup work (Huang et al., 2020). Further error analysis shows that increasing depth can help improve generalization on small datasets for hard cases that require reasoning and structural understanding.
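The core trick behind such schemes is to shrink the residual-branch weights of the newly added layers at initialization by a factor that depends on depth. The sketch below illustrates the idea with the simpler depth-only factor 0.67 · N^(-1/4) from T-Fixup applied to freshly Xavier-initialized layers; the actual DT-Fixup factor is additionally data-dependent (derived from the pre-trained encoder's output magnitudes), so this is an illustration of the family of methods, not the paper's exact formula.

```python
import numpy as np

def xavier(shape, rng):
    """Standard Xavier/Glorot uniform initialization."""
    limit = np.sqrt(6.0 / (shape[0] + shape[1]))
    return rng.uniform(-limit, limit, size=shape)

def init_new_layers(n_layers, d_model, rng=None):
    """Initialize the residual-branch weights (value/output projections
    and feed-forward matrices) of n_layers extra transformer layers,
    rescaled by the depth-dependent T-Fixup factor 0.67 * N^(-1/4).
    Deeper stacks get smaller initial updates, which is what allows
    training without warmup or layer normalization tricks."""
    rng = rng or np.random.default_rng(0)
    scale = 0.67 * n_layers ** (-0.25)
    layers = []
    for _ in range(n_layers):
        layers.append({
            "W_v":   scale * xavier((d_model, d_model), rng),
            "W_o":   scale * xavier((d_model, d_model), rng),
            "W_ff1": scale * xavier((d_model, 4 * d_model), rng),
            "W_ff2": scale * xavier((4 * d_model, d_model), rng),
        })
    return layers
```

With 24 new layers (as in the abstract), the rescaling multiplies each residual-branch weight by roughly 0.3, bounding the magnitude of each layer's contribution at the start of training.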
Semantic segmentation of remote sensing images is a critical and challenging task. Graph neural networks, which can capture global contextual representations, can exploit long-range pixel dependencies and thereby improve semantic segmentation performance. In this paper, a novel self-constructing graph attention neural network is proposed for this purpose. Firstly, ResNet50 is employed as the backbone of a feature-extraction network to acquire feature maps of remote sensing images. Secondly, pixel-wise dependency graphs are constructed from the feature maps, and a graph attention network is designed to extract the correlations among pixels of the remote sensing images. Thirdly, a channel linear attention mechanism obtains the channel dependency of the images, further improving the semantic segmentation predictions. Lastly, we conducted comprehensive experiments and found that the proposed model consistently outperforms state-of-the-art methods on two widely used remote sensing image datasets.
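The two attention steps above can be sketched in a few lines: the graph is "self-constructing" in that the pixel adjacency is computed directly from feature similarity rather than from a predefined neighbourhood, and the channel attention reweights feature channels by a global score. This is a simplified numpy illustration of the two mechanisms, not the paper's exact formulation (function names and the similarity measure are assumptions).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_constructing_graph_attention(feat):
    """feat: (H*W, C) feature map flattened over pixels.
    Builds a dense pixel-wise dependency graph from feature
    similarity and aggregates all pixels with attention weights,
    capturing long-range dependencies in one step."""
    attn = softmax(feat @ feat.T / np.sqrt(feat.shape[1]), axis=-1)
    return attn @ feat

def channel_linear_attention(feat):
    """Reweights the C channels by a softmax over their global
    (mean-pooled) responses, modelling channel dependency."""
    score = softmax(feat.mean(axis=0))   # (C,) channel descriptor
    return feat * score                  # broadcast over pixels
```

Because the adjacency is recomputed from the features of each image, the graph adapts to the image content instead of relying on a fixed grid neighbourhood.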
Semantic segmentation of remote sensing (RS) images, a fundamental research topic, classifies each pixel in an image. It plays an essential role in many downstream RS tasks, such as land-cover mapping, road extraction, and traffic monitoring. Although deep-learning-based methods have recently shown their dominance in automatic semantic segmentation of RS imagery, the performance of existing methods relies heavily on large amounts of high-quality training data, which are usually hard to obtain in practice. Moreover, human-in-the-loop semantic segmentation of RS imagery cannot be completely replaced by automatic segmentation models, since automatic models are prone to error in complex scenarios. To address these issues, we propose an improved, smart, interactive segmentation model, DRE-Net, for RS images. The proposed model lets humans perform segmentation with simple mouse clicks. Firstly, a dynamic radius-encoding (DRE) algorithm is designed to distinguish the purpose of each click, such as selecting a segmentation outline or fine-tuning it. Secondly, we propose an incremental training strategy so that the proposed model not only converges quickly, but also produces refined segmentation results. Finally, we conducted comprehensive experiments on the Potsdam and Vaihingen datasets and achieved 9.75% and 7.03% improvements in NoC95 over the state-of-the-art results, respectively. In addition, our DRE-Net can improve the convergence and generalization of a network while maintaining a fast inference speed.
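To make the click-encoding idea concrete, interactive segmentation models typically rasterize user clicks into a guidance channel fed to the network. The sketch below encodes clicks as disks whose radius shrinks with each successive click, so early clicks act as coarse outline selection and later clicks as local fine-tuning; this is a hypothetical illustration of a dynamic-radius encoding, not the paper's exact DRE algorithm (the radius schedule and parameters are assumptions).

```python
import numpy as np

def click_map(h, w, clicks, base_radius=40, decay=0.5, min_radius=5):
    """Encode a sequence of (row, col) user clicks as one (h, w)
    guidance map. Each click paints a disk; the radius halves after
    every click (floored at min_radius), so the encoding itself
    distinguishes coarse selection clicks from fine-tuning clicks."""
    ys, xs = np.mgrid[0:h, 0:w]
    out = np.zeros((h, w))
    r = base_radius
    for (cy, cx) in clicks:
        dist = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
        out = np.maximum(out, (dist <= r).astype(float))
        r = max(min_radius, r * decay)
    return out
```

The resulting map is concatenated with the image as an extra input channel, letting the network read both where the user clicked and, implicitly via the disk size, why.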