Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015)
DOI: 10.3115/v1/p15-1004

Statistical Machine Translation Features with Multitask Tensor Networks

Abstract: We present a three-pronged approach to improving Statistical Machine Translation (SMT), building on recent success in the application of neural networks to SMT. First, we propose new features based on neural networks to model various nonlocal translation phenomena. Second, we augment the architecture of the neural network with tensor layers that capture important higher-order interaction among the network units. Third, we apply multitask learning to estimate the neural network parameters jointly. Each of our p…
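The tensor layers mentioned in the abstract replace a purely linear transformation with one that also computes multiplicative, second-order interactions between hidden units. Below is a minimal sketch of one common formulation, y_k = tanh(h^T T_k h + W_k h + b_k), under assumed small dimensions; the variable names and sizes are illustrative, not taken from the paper.

```python
import numpy as np

# Minimal sketch of a tensor (bilinear) layer: each output unit k combines a
# quadratic term h^T T[k] h, capturing pairwise interactions among hidden
# units, with an ordinary linear term W[k] h. Dimensions are illustrative.

rng = np.random.default_rng(0)

hidden_dim, out_dim = 8, 4
T = rng.normal(scale=0.1, size=(out_dim, hidden_dim, hidden_dim))  # 3-way tensor
W = rng.normal(scale=0.1, size=(out_dim, hidden_dim))              # linear part
b = np.zeros(out_dim)

def tensor_layer(h):
    # Second-order term for every output unit k: h^T T[k] h.
    quadratic = np.einsum('i,kij,j->k', h, T, h)
    linear = W @ h
    return np.tanh(quadratic + linear + b)

h = rng.normal(size=hidden_dim)
print(tensor_layer(h))  # four activations mixing linear and bilinear terms
```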

Cited by 10 publications (5 citation statements) | References 22 publications

Citation statements:
“…Tensors are powerful because they can capture important higher-order interactions across time, feature dimensions, and multiple modalities (Kossaifi et al., 2017). For unimodal tasks, tensors have been used for part-of-speech tagging (Srikumar and Manning, 2014), dependency parsing (Lei et al., 2014), word segmentation (Pei et al., 2014), question answering (Qiu and Huang, 2015), and machine translation (Setiawan et al., 2015). For multimodal tasks, Huang et al. (2017) used tensor products between image and text features for image captioning.…”
Section: Related Work
confidence: 99%
“…CP, Tucker and TT decompositions have been leveraged in the context of neural networks [56, 59, 79, 96, 97, 120, 131, 156, 159], with the weight matrix of a fully-connected layer or a convolutional layer stored in compressed form as a low-rank tensor, thus reducing redundancies in the network parameterization. Regarding theoretical aspects and the understanding of deep neural networks through tensors, Cohen et al. [29] analyzed the expressive power of deep architectures by drawing analogies between shallow networks and the rank-1 CP decomposition, as well as between deep networks and the hTucker decomposition.…”
Section: Machine
confidence: 99%
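The compression idea in the statement above can be illustrated in its simplest form: replacing a dense weight matrix with a rank-R factorization (CP, Tucker, and TT generalize this to higher-order weight tensors). A minimal sketch under assumed dimensions; U, V, and the sizes are illustrative, not from the cited works.

```python
import numpy as np

# Sketch of low-rank compression of a fully-connected layer: W ≈ U @ V.
# Two skinny matmuls replace one dense one, cutting parameters from
# in_dim*out_dim to rank*(in_dim + out_dim). All dimensions are assumptions.

rng = np.random.default_rng(0)

in_dim, out_dim, rank = 512, 512, 16
U = rng.normal(scale=0.05, size=(out_dim, rank))
V = rng.normal(scale=0.05, size=(rank, in_dim))

def low_rank_linear(x):
    return U @ (V @ x)

x = rng.normal(size=in_dim)
y = low_rank_linear(x)

dense_params = in_dim * out_dim              # 262,144
factored_params = rank * (in_dim + out_dim)  # 16,384 — 16x fewer
print(y.shape, dense_params, factored_params)
```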
“…These models can be trained on parallel corpora and do not need word alignments to be learned in advance. There are also neural translation models that are trained on a word-aligned parallel corpus (Devlin et al., 2014; Meng et al., 2015; Zhang et al., 2015; Setiawan et al., 2015), which use the alignment information to decide which parts of the source sentence are more important for predicting a particular target word. All these models are trained on plain source and target sentences without considering any syntactic information, while our neural model learns rule selection for tree-based translation rules and makes use of the tree structure of natural language for better translation.…”
Section: Related Work
confidence: 99%
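As a rough illustration of how alignment information can single out the relevant source words for one target prediction, in the spirit of the alignment-based joint models cited above: the sketch below extracts a fixed window of source tokens around the aligned position. The function, example data, and alignment format are hypothetical.

```python
# Hypothetical helper: given a target position's aligned source index
# (e.g. from a word aligner such as GIZA++), return a window of source
# tokens centered on it, padded at the sentence boundaries.

def source_window(source_tokens, aligned_idx, radius=2, pad="<s>"):
    """Return the 2*radius+1 source tokens centered on the aligned position."""
    padded = [pad] * radius + list(source_tokens) + [pad] * radius
    center = aligned_idx + radius
    return padded[center - radius : center + radius + 1]

src = ["wo", "xihuan", "zhe", "ben", "shu"]
# alignment: target position -> aligned source index (illustrative)
alignment = {0: 0, 1: 1, 2: 4}  # "I" -> "wo", "like" -> "xihuan", "book" -> "shu"
print(source_window(src, alignment[2]))  # ['zhe', 'ben', 'shu', '<s>', '<s>']
```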