Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1012

Neural Machine Translation with Source-Side Latent Graph Parsing

Abstract: This paper presents a novel neural machine translation model which jointly learns translation and source-side latent graph representations of sentences. Unlike existing pipelined approaches using syntactic parsers, our end-to-end model learns a latent graph parser as part of the encoder of an attention-based neural machine translation model, and thus the parser is optimized according to the translation objective. In experiments, we first show that our model compares favorably with state-of-the-art sequential a…
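As a rough illustration of the latent graph parsing idea in the abstract, the sketch below (not the paper's actual architecture; the pair scorer, the self-head masking, and the way the graph feeds the decoder are simplifying assumptions) scores every head/dependent pair of encoder states, softmax-normalizes each row into a soft head distribution, and concatenates the expected head representation to each word's state. Because the whole computation is differentiable, the translation loss can train the scorer end to end, which is the point the abstract makes against pipelined parsers.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def latent_graph_encode(H, W_dep, exclude_self=True):
    """Soft 'latent graph' layer over encoder states (illustrative sketch).

    H:     (n, d) sequential encoder states (e.g. from a bi-RNN).
    W_dep: (d, d) parameter scoring head/dependent pairs; a simplification of
           the paper's parsing component, assumed here for illustration.

    Returns the (n, n) soft head distribution P and (n, 2d) graph-aware states
    [H ; P @ H] that a downstream attention/decoder could consume.
    """
    n, d = H.shape
    scores = H @ W_dep @ H.T                      # scores[i, j]: word j as head of word i
    if exclude_self:
        scores[np.arange(n), np.arange(n)] = -1e9  # a word is not its own head
    P = softmax(scores, axis=1)                   # each row: distribution over candidate heads
    head_ctx = P @ H                              # expected head representation per word
    return P, np.concatenate([H, head_ctx], axis=1)

# toy usage
rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))
W = rng.standard_normal((8, 8)) * 0.1
P, H_graph = latent_graph_encode(H, W)
print(P.shape, H_graph.shape)   # (5, 5) (5, 16)
```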

Cited by 47 publications (47 citation statements). References: 30 publications.
“…As a consequence, most of the research uses dependency graph information as an external feature or carefully engineers more compact features extracted from the dependency tree arcs [24], [25]. On the other hand, some research adopts input latent graph parsing [33] as the syntax representation. Inducing the dependency tree in a principled manner while training allows the model to learn the internal representation of the sentence very well [31], [34].…”
Section: Structured Attention (mentioning)
confidence: 99%
“…The probability for a packed d-length dependency chain is obtained from a dependency graph, which is an edge-factored dependency score matrix (Hashimoto and Tsuruoka, 2017; Zhang et al., 2017). First, we explain the dependency graph.…”
Section: Packed D-length Dependency Chain (mentioning)
confidence: 99%
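A minimal sketch of the edge-factored view referred to in this statement: row-normalizing a dependency score matrix gives, for each word, a distribution over candidate heads, and under one simplified reading (an assumption here, not the cited papers' exact construction) a d-length dependency chain distribution is just the d-th power of that soft head matrix.

```python
import numpy as np

def head_distribution(scores):
    """Row-normalize an edge-factored dependency score matrix.
    scores[i, j] = score of word j being the head of word i."""
    scores = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def chain_distribution(P, d):
    """Probability of reaching each word by following d soft head steps;
    a simplified stand-in for a 'packed d-length dependency chain'."""
    return np.linalg.matrix_power(P, d)

rng = np.random.default_rng(1)
scores = rng.standard_normal((6, 6))   # e.g. from a pairwise scorer over encoder states
P1 = head_distribution(scores)         # direct heads (d = 1)
P2 = chain_distribution(P1, 2)         # grandparents via two soft steps (d = 2)
print(P1.sum(axis=1))                  # each row sums to 1
```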
“…Thus, these methods cannot track all possible parents for each word within the decoding process. Similar to HiSAN, Hashimoto and Tsuruoka (2017) use dependency features as attention distributions, but different from HiSAN, they use pre-trained dependency relations and do not take into account the chains of dependencies. Bastings et al. (2017) consider higher-order dependency relationships in Seq2Seq by incorporating a graph convolution technique (Kipf and Welling, 2016) into the encoder.…”
Section: Related Work (mentioning)
confidence: 99%
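The graph-convolution idea mentioned in this statement can be sketched as follows. This is a bare Kipf-and-Welling-style layer over a dependency tree; the syntactic GCN of Bastings et al. (2017) additionally uses edge directions, labels, and gating, so the details below are simplifying assumptions.

```python
import numpy as np

def gcn_layer(X, heads, W, b):
    """One graph-convolution layer over a dependency tree (rough sketch).

    X:     (n, d_in) input word/encoder representations.
    heads: length-n list, heads[i] = index of word i's head (-1 for root).
    W, b:  (d_in, d_out) weights and (d_out,) bias.
    """
    n = X.shape[0]
    A = np.eye(n)                       # self loops
    for i, h in enumerate(heads):
        if h >= 0:
            A[i, h] = A[h, i] = 1.0     # undirected dependency edges
    deg = A.sum(axis=1, keepdims=True)
    return np.maximum((A / deg) @ X @ W + b, 0.0)   # mean-aggregate neighbours, then ReLU

rng = np.random.default_rng(2)
X = rng.standard_normal((4, 8))         # e.g. bi-RNN states for a 4-word sentence
heads = [1, -1, 1, 2]                   # a toy dependency tree (word 1 is the root)
H = gcn_layer(X, heads, rng.standard_normal((8, 8)) * 0.1, np.zeros(8))
print(H.shape)                          # (4, 8) graph-aware encoder states
```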
“…However, in most existing NMT models, source sentences are treated as sequences and syntactic knowledge is neglected. Some effort has been made to incorporate source syntax into NMT to enhance the attention model [Eriguchi et al., 2016b; Hashimoto and Tsuruoka, 2017; Sennrich and Haddow, 2016]. [Eriguchi et al., 2016b] proposed a tree-to-sequence attentional NMT model in which a source-side parse tree was used, achieving promising improvements.…”
Section: Related Work (mentioning)
confidence: 99%
“…[Sennrich and Haddow, 2016] incorporated linguistic features to improve NMT performance by appending feature vectors to word embeddings. [Hashimoto and Tsuruoka, 2017] proposed a multi-task framework to learn both source parsing and translation. Different from previous syntax-based work, in this paper we focus on improving the NMT encoder with source-side long-distance word dependencies.…”
Section: Related Work (mentioning)
confidence: 99%
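The feature-appending scheme attributed to [Sennrich and Haddow, 2016] above amounts to concatenating embeddings of linguistic features to the word embedding before encoding. The sketch below uses a single POS feature with toy dimensions, which is an illustrative assumption rather than that paper's actual feature set or configuration.

```python
import numpy as np

def embed_with_features(word_ids, pos_ids, E_word, E_pos):
    """Concatenate a word embedding with a linguistic-feature embedding
    (only POS here, for illustration; lemmas, morphology, and dependency
    labels could be appended the same way) before feeding the encoder."""
    return np.concatenate([E_word[word_ids], E_pos[pos_ids]], axis=1)

rng = np.random.default_rng(3)
E_word = rng.standard_normal((1000, 12))   # toy vocabulary of 1000 words
E_pos = rng.standard_normal((20, 4))       # toy set of 20 POS tags
x = embed_with_features(np.array([5, 42, 7]), np.array([1, 3, 1]), E_word, E_pos)
print(x.shape)                             # (3, 16): 12-dim word + 4-dim feature
```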