Non-autoregressive translation (NAT) models, which remove the dependence on previous target tokens from the inputs of the decoder, achieve significant inference speedup but at the cost of inferior accuracy compared to autoregressive translation (AT) models. Previous work shows that the quality of the decoder inputs is important and largely impacts model accuracy. In this paper, we propose two methods to enhance the decoder inputs so as to improve NAT models. The first directly leverages a phrase table generated by conventional SMT approaches to translate source tokens into target tokens, which are then fed into the decoder as inputs. The second transforms source-side word embeddings into target-side word embeddings through sentence-level alignment and word-level adversarial learning, and then feeds the transformed word embeddings into the decoder as inputs. Experimental results show that our method outperforms the NAT baseline (Gu et al. 2017) by 5.11 BLEU points on the WMT14 English-German task and 4.72 BLEU points on the WMT16 English-Romanian task.
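As a rough illustration of the first method, the sketch below builds NAT decoder inputs by translating each source token through a toy phrase table. The table entries and the `<unk>` fallback are hypothetical stand-ins for a table extracted by an SMT toolkit, not the paper's actual setup:

```python
# Toy phrase table: source token -> most probable target token.
# Entries are hypothetical; a real table would come from an SMT toolkit.
phrase_table = {
    "das": "the",
    "haus": "house",
    "ist": "is",
    "klein": "small",
}

def enhance_decoder_inputs(source_tokens, table, unk="<unk>"):
    """Map each source token to a target token via the phrase table,
    falling back to a placeholder for unknown tokens, so the decoder
    input keeps one token per source position."""
    return [table.get(tok, unk) for tok in source_tokens]

print(enhance_decoder_inputs(["das", "haus", "ist", "gross"], phrase_table))
# -> ['the', 'house', 'is', '<unk>']
```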
kNN-MT, recently proposed by Khandelwal et al. (2020a), successfully combines a pretrained neural machine translation (NMT) model with token-level k-nearest-neighbor (kNN) retrieval to improve translation accuracy. However, the traditional kNN algorithm used in kNN-MT simply retrieves the same number of nearest neighbors for each target token, which may cause prediction errors when the retrieved neighbors include noise. In this paper, we propose Adaptive kNN-MT to dynamically determine the value of k for each target token. We achieve this by introducing a lightweight Meta-k Network, which can be efficiently trained with only a few training samples. On four benchmark machine translation datasets, we demonstrate that the proposed method effectively filters out noise in the retrieval results and significantly outperforms the vanilla kNN-MT model. Even more noteworthy, a Meta-k Network learned on one domain can be directly applied to other domains and obtain consistent improvements, illustrating the generality of our method. Our implementation is open-sourced at https://github.com/zhengxxn/adaptive-knn-mt.
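A minimal sketch of how such a Meta-k Network might look, using the sorted neighbor distances as input features; the layer sizes, feature choice, and candidate set of k values (zero plus powers of two) are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaKNetwork(nn.Module):
    """Predict a distribution over candidate values of k from the sorted
    distances of the retrieved neighbors. k = 0 means 'ignore retrieval'
    and fall back to the plain NMT prediction."""

    def __init__(self, max_k=8, hidden=32):
        super().__init__()
        # Candidate k values: 0 plus powers of two up to max_k.
        self.k_choices = [0] + [2 ** i for i in range(max_k.bit_length())
                                if 2 ** i <= max_k]
        self.net = nn.Sequential(
            nn.Linear(max_k, hidden),
            nn.Tanh(),
            nn.Linear(hidden, len(self.k_choices)),
        )

    def forward(self, distances):
        # distances: (batch, max_k), sorted ascending.
        return F.softmax(self.net(distances), dim=-1)

meta_k = MetaKNetwork(max_k=8)
dists = torch.rand(2, 8).sort(dim=-1).values  # toy neighbor distances
weights = meta_k(dists)                       # (2, 5): one weight per candidate k
```

In use, these weights would mix the translation distributions obtained from the top-k retrieved neighbors for each candidate k, with k = 0 reducing to the plain NMT output.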
Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and that both share the same model configurations, a natural idea to improve the accuracy of NAT models is to transfer a well-trained AT model to an NAT model through fine-tuning. However, since AT and NAT models differ greatly in training strategy, straightforward fine-tuning does not work well. In this work, we introduce curriculum learning into fine-tuning for NAT. Specifically, we design a curriculum in the fine-tuning process to progressively switch the training from autoregressive generation to non-autoregressive generation. Experiments on four benchmark translation datasets show that the proposed method achieves good improvement (more than 1 BLEU point) over previous NAT baselines in terms of translation accuracy, and greatly speeds up inference (more than 10 times) over AT baselines.
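The sketch below illustrates one plausible form of such a curriculum: each decoder input position keeps the previous gold target token (autoregressive teacher forcing) or is replaced by a placeholder (non-autoregressive input), with the replacement probability growing over fine-tuning. The linear pacing function and the `<mask>` placeholder are assumptions for illustration:

```python
import random

def curriculum_decoder_inputs(prev_gold_tokens, step, total_steps,
                              placeholder="<mask>"):
    """Mix AT- and NAT-style decoder inputs: keep the previous gold target
    token (teacher forcing) with shrinking probability, replace it with a
    placeholder with growing probability as fine-tuning progresses."""
    p_nat = min(1.0, step / total_steps)  # fraction of NAT-style positions
    return [placeholder if random.random() < p_nat else tok
            for tok in prev_gold_tokens]

# Early in fine-tuning: mostly gold tokens (close to AT training).
print(curriculum_decoder_inputs(["<s>", "the", "cat", "sat"], 100, 10000))
# Late in fine-tuning: mostly placeholders (close to NAT training).
print(curriculum_decoder_inputs(["<s>", "the", "cat", "sat"], 9900, 10000))
```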
The masked language model has received remarkable attention due to its effectiveness on various natural language processing tasks. However, few works have adopted this technique in sequence-to-sequence models. In this work, we introduce a jointly masked sequence-to-sequence model and explore its application to non-autoregressive neural machine translation (NAT). Specifically, we first empirically study the functionalities of the encoder and the decoder in NAT models, and find that the encoder plays a more important role than the decoder with regard to translation quality. Therefore, we propose to train the encoder more rigorously by masking the encoder input during training. As for the decoder, we propose to train it with consecutive masking of the decoder input and an n-gram loss function to alleviate the problem of translating duplicate words. The two types of masks are applied to the model jointly at the training stage. We conduct experiments on five benchmark machine translation tasks, and our model achieves 27.69/32.24 BLEU scores on the WMT14 English-German/German-English tasks with a more than 5 times speedup compared with an autoregressive model.
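A minimal sketch of the two masking schemes, assuming a random token-level mask on the encoder side and a single consecutive span on the decoder side; the masking ratio, span length, and single-span choice are illustrative assumptions:

```python
import random

def mask_encoder_input(tokens, p=0.15, mask="<mask>"):
    """Randomly mask encoder input tokens (ratio p is an assumption)."""
    return [mask if random.random() < p else t for t in tokens]

def mask_decoder_span(tokens, n=3, mask="<mask>"):
    """Mask one consecutive span of n decoder input tokens, matching the
    consecutive-masking idea described above."""
    if len(tokens) <= n:
        return [mask] * len(tokens)
    start = random.randrange(len(tokens) - n + 1)
    return [mask if start <= i < start + n else t
            for i, t in enumerate(tokens)]

src = ["the", "house", "is", "small", "."]
tgt = ["das", "haus", "ist", "klein", "."]
print(mask_encoder_input(src))  # e.g. ['the', '<mask>', 'is', 'small', '.']
print(mask_decoder_span(tgt))   # e.g. ['das', '<mask>', '<mask>', '<mask>', '.']
```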
Recent advances in the field of network embedding have shown that low-dimensional network representation plays a critical role in network analysis. Most existing network embedding methods encode the local proximity of a node, such as the first- and second-order proximities. While being efficient, these methods fall short of leveraging the global structural information between nodes distant from each other. In addition, most existing methods learn embeddings on one single fixed network, and thus cannot be generalized to unseen nodes or networks without retraining. In this paper we present SPINE, a method that can jointly capture the local proximity and proximities at any distance, while being inductive so as to efficiently deal with unseen nodes or networks. Extensive experimental results on benchmark datasets demonstrate the superiority of the proposed framework over the state of the art.
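As a rough sketch of combining local and long-range proximity inductively, the code below scores all nodes by a random walk with restart (rooted PageRank) from a target node and embeds that node as a weighted sum of the features of its top-scoring neighbors; whether this matches SPINE's exact formulation is an assumption, and the function names are hypothetical:

```python
import numpy as np

def rooted_pagerank(adj, root, alpha=0.5, iters=50):
    """Random walk with restart from `root`; the stationary visiting
    probabilities capture proximity at any distance, not just 1- or
    2-hop neighborhoods."""
    n = adj.shape[0]
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    P = adj / deg                     # row-stochastic transition matrix
    r = np.zeros(n)
    r[root] = 1.0
    pi = r.copy()
    for _ in range(iters):
        pi = (1 - alpha) * r + alpha * pi @ P
    return pi

def inductive_embedding(adj, feats, root, top=4):
    """Embed a (possibly unseen) node by aggregating the content features
    of its top-scoring rooted-PageRank neighbors; since no per-node
    parameters are learned, the scheme stays inductive."""
    scores = rooted_pagerank(adj, root)
    idx = np.argsort(-scores)[:top]
    return (scores[idx, None] * feats[idx]).sum(axis=0)

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.eye(4)  # toy node features
print(inductive_embedding(adj, feats, root=0))
```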
To improve the Al/Steel bimetallic interface, Eu was first added to an Al/Steel bimetallic interface made by liquid-solid casting. The effects of Eu addition on the microstructure, mechanical properties, and fracture behavior of the Al/Steel bimetallic interface were studied in detail. With the addition of 0.1 wt.% Eu, the morphology of eutectic Si in the Al-Si alloy changed from coarse plate-like to fine fibrous and granular, and the average thickness of the intermetallic compound layer decreased to a minimum of 7.96 μm. In addition, the Fe concentration dropped more sharply on the steel side, and more Si was observed on the Al side than under the other conditions. The addition of Eu did not change the kinds of intermetallic compounds in the Al/Steel reaction layer, which was composed of Al5Fe2, τ1-(Al, Si)5Fe3, Al13Fe4, τ5-Al7Fe2Si, and τ6-Al9Fe2Si2 phases. Nor did Eu change the preferential orientation of these phases, but it refined the grain size of each phase and decreased the polar density of the Al5Fe2 phase. Eu was mainly enriched at the front of the ternary compound layer (τ6-Al9Fe2Si2) near the Al side and the steel matrix. The Fe and Al distribution regions at the interface tended to narrow after the addition of 0.1 wt.% Eu, probably because Eu inhibits the diffusion of Al atoms along the c-axis of the Al5Fe2 phase and the growth of the Al13Fe4, τ5-Al7Fe2Si, and τ6-Al9Fe2Si2 phases. When the Eu content was 0.1 wt.%, the shear strength of the Al/Steel bimetal reached a maximum of 31.21 MPa, 47% higher than that of the bimetal without Eu.