We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as a common seed from which to obtain derivative, improved models for arbitrary language pairs? We propose mRASP, an approach to pre-training a universal multilingual neural machine translation model. The key idea of mRASP is its novel technique of random aligned substitution, which brings words and phrases with similar meanings across multiple languages closer in the representation space. We pre-train an mRASP model jointly on 32 language pairs using only public datasets. The model is then fine-tuned on downstream language pairs to obtain specialized MT models. We carry out extensive experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs. Experimental results demonstrate that mRASP achieves significant performance improvements over models trained directly on those target pairs. To the best of our knowledge, this is the first work to verify that multiple low-resource language pairs can be utilized to improve rich-resource MT. Surprisingly, mRASP even improves translation quality for exotic languages that never occur in the pre-training corpus. Code, data, and pre-trained models are available at https://github.com/linzehui/mRASP.
* Equal contribution. The work was done when the first author was an intern at ByteDance.
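To make the random aligned substitution idea concrete, here is a minimal sketch: source tokens are randomly replaced with dictionary translations in other languages, so that synonyms across languages appear in shared contexts during pre-training. The toy dictionary, the substitution probability, and the function name are illustrative assumptions, not the authors' actual resources (mRASP builds on bilingual dictionaries such as MUSE).

```python
import random

# Hypothetical toy bilingual dictionary: maps a source token to candidate
# translations in other languages. An assumption for illustration only.
toy_dict = {
    "hello": ["bonjour", "hola"],
    "world": ["monde", "mundo"],
}

def random_aligned_substitution(tokens, dictionary, prob=0.3):
    """Randomly replace source tokens with a dictionary translation.

    Tokens with similar meanings across languages then share training
    contexts, pulling their representations closer together.
    The probability value is an assumption, not the paper's setting.
    """
    out = []
    for tok in tokens:
        if tok in dictionary and random.random() < prob:
            out.append(random.choice(dictionary[tok]))
        else:
            out.append(tok)
    return out

print(random_aligned_substitution("hello world".split(), toy_dict))
# e.g. ['bonjour', 'world']
```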
Existing multilingual machine translation approaches mainly focus on English-centric directions, while non-English directions still lag behind. In this work, we aim to build a many-to-many translation system with an emphasis on the quality of non-English directions. Our intuition is based on the hypothesis that a universal cross-language representation leads to better multilingual translation performance. To this end, we propose mRASP2, a training method that obtains a single unified multilingual translation model. mRASP2 is empowered by two techniques: a) a contrastive learning scheme that closes the gap among representations of different languages, and b) data augmentation on both parallel and monolingual data to further align token representations. For English-centric directions, mRASP2 outperforms the existing best unified model and achieves performance competitive with or better than the pre-trained and fine-tuned model mBART on tens of WMT translation directions. For non-English directions, mRASP2 achieves an average improvement of more than 10 BLEU over the multilingual Transformer baseline. Code, data, and trained models are available at https://github.com/PANXiao1994/mRASP2.
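The abstract does not spell out the contrastive objective, so the following is a minimal sketch of one plausible instantiation: an InfoNCE-style loss over pooled encoder representations of parallel sentences, where each source sentence is pulled toward its own translation and pushed away from other sentences in the batch. The temperature value, the pooling choice, and the use of in-batch negatives are assumptions; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(src_repr, tgt_repr, temperature=0.1):
    """InfoNCE-style loss over sentence representations.

    src_repr, tgt_repr: (batch, dim) pooled encoder outputs of parallel
    sentence pairs. Row i of src_repr and row i of tgt_repr are
    translations of each other; all other rows serve as negatives.
    The temperature is an assumed value, not the paper's setting.
    """
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    logits = src @ tgt.t() / temperature              # (batch, batch) cosine sims
    labels = torch.arange(src.size(0), device=src.device)
    return F.cross_entropy(logits, labels)

# Usage with random stand-in representations:
loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```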
Objectives: To evaluate the diagnostic efficiency of deep-learning models in distinguishing malignant from benign parotid tumors on plain computed tomography (CT) images.
Materials and methods: CT images of 283 patients with parotid tumors were collected and analyzed retrospectively. Of these, 150 tumors were benign and 133 were malignant according to pathology results. A total of 917 regions of interest of parotid tumors were cropped (456 benign and 461 malignant). Three deep-learning networks (ResNet50, VGG16_bn, and DenseNet169) were used for diagnosis (approximately 3:1 split for training and testing). The diagnostic efficiencies (accuracy, sensitivity, specificity, and area under the curve [AUC]) of the three networks were calculated and compared on the 917 images. To simulate the process of human diagnosis, a voting model was added at the end of the networks, and the 283 tumors were classified as benign or malignant. Meanwhile, the 917 tumor images were classified by two radiologists (A and B), and the original CT images were classified by radiologist B. The diagnostic efficiencies of the three deep-learning network models (after voting) and the two radiologists were calculated.
Results: On the 917 CT images, ResNet50 showed high accuracy and sensitivity for diagnosing malignant parotid tumors; the accuracy, sensitivity, specificity, and AUC were 90.8%, 91.3%, 90.4%, and 0.96, respectively. On the 283 tumors, the accuracy, sensitivity, and specificity of ResNet50 (after voting) were 92.3%, 93.5%, and 91.2%, respectively.
Conclusion: ResNet50 showed high sensitivity in distinguishing malignant from benign parotid tumors on plain CT images, making it a promising auxiliary method for screening malignant parotid tumors.
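The abstract does not specify the voting rule used to aggregate per-region predictions into a per-tumor diagnosis. A simple majority vote over the predictions for all regions of interest belonging to one tumor is one plausible reading; the function below is a hypothetical sketch under that assumption.

```python
from collections import Counter

def vote_tumor_label(roi_predictions):
    """Aggregate per-ROI predictions ('benign'/'malignant') for one tumor
    by majority vote. Assumed aggregation rule; the paper's voting model
    may differ (e.g., probability averaging or a weighted scheme)."""
    counts = Counter(roi_predictions)
    return counts.most_common(1)[0][0]

print(vote_tumor_label(["malignant", "benign", "malignant"]))  # -> malignant
```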