Katsuki Chousa scite author profile

Katsuki Chousa

5 Publications

36 Citation Statements Received

100 Citation Statements Given


Publications


A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT

Nagata, Chousa, Nishino

2020


We present a novel supervised word alignment method based on cross-language span prediction. We first formalize a word alignment problem as a collection of independent predictions from a token in the source sentence to a span in the target sentence. Since this step is equivalent to a SQuAD v2.0 style question answering task, we solve it using multilingual BERT, which is fine-tuned on manually created gold word alignment data. It is nontrivial to obtain accurate alignment from a set of independently predicted spans. We greatly improved the word alignment accuracy by adding the source token's context to the question and symmetrizing two directional predictions. In experiments using five word alignment datasets from among Chinese, Japanese, German, Romanian, French, and English, we show that our proposed method significantly outperformed previous supervised and unsupervised word alignment methods without any bitexts for pretraining. For example, we achieved an F1 score of 86.7 for the Chinese-English data, which is 13.3 points higher than the previous state-of-the-art supervised method.
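The symmetrization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each direction yields a probability per (source index, target index) pair, that the two directions are combined by averaging, and that a pair is kept when the average reaches a threshold of 0.5. The function name, the dictionary layout, and the threshold are hypothetical choices for illustration and are not taken from the abstract.

```python
def symmetrize(src_to_tgt, tgt_to_src, threshold=0.5):
    """Combine two directional alignment predictions.

    Both arguments map (source_index, target_index) -> probability.
    Assumption: directions are merged by averaging; a pair missing
    from one direction counts as probability 0.0 in that direction.
    """
    aligned = set()
    for pair in set(src_to_tgt) | set(tgt_to_src):
        avg = (src_to_tgt.get(pair, 0.0) + tgt_to_src.get(pair, 0.0)) / 2
        if avg >= threshold:
            aligned.add(pair)
    return aligned


# Toy probabilities (made up for illustration only).
forward = {(0, 0): 0.9, (1, 2): 0.8, (2, 1): 0.3}
backward = {(0, 0): 0.7, (1, 2): 0.1, (2, 1): 0.9}
alignment = symmetrize(forward, backward)
```

With these toy values, the pair (1, 2) is dropped because only one direction supports it strongly, while (2, 1) is kept because the backward direction compensates for a weak forward score.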


Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings

Oka, Chousa, Sudoh, et al.

2020


Neural Machine Translation often suffers from an under-translation problem due to its limited modeling of output sequence lengths. In this work, we propose a novel approach to training a Transformer model using length constraints based on length-aware positional encoding (PE). Since length constraints with exact target sentence lengths degrade translation performance, we add random noise within a certain window size to the length constraints in the PE during training. In the inference step, we predict the output lengths using input sequences and a BERT-based length prediction model. Experimental results in ASPEC English-to-Japanese translation showed that the proposed method produced translations with lengths close to the reference ones and outperformed a vanilla Transformer by 3.22 points in BLEU on short sentences within ten subwords. The average translation results using our length prediction model were also better than another baseline method using input lengths for the length constraints. The proposed noise injection improved robustness to length prediction errors, especially within the window size.
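The noise injection described above can be sketched as follows. The sketch assumes a sinusoidal encoding in which the usual position is replaced by the remaining length (constraint length minus position), and it assumes the noise is a uniform integer within the window. Both are illustrative assumptions, and the names `noisy_length` and `length_aware_pe` are hypothetical.

```python
import math
import random


def noisy_length(true_length, window, rng):
    """Perturb the length constraint by a uniform integer in [-window, window].

    Assumption: uniform integer noise, clipped so the length stays >= 1.
    """
    return max(1, true_length + rng.randint(-window, window))


def length_aware_pe(pos, length, d_model):
    """Sinusoidal encoding of the remaining length (length - pos).

    Assumption: this replaces the position in the standard sinusoidal
    formula with the number of tokens left until the length constraint.
    """
    remaining = length - pos
    pe = []
    for k in range(d_model):
        angle = remaining / (10000 ** ((2 * (k // 2)) / d_model))
        pe.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
    return pe


rng = random.Random(0)
train_length = noisy_length(true_length=10, window=2, rng=rng)
first_token_pe = length_aware_pe(pos=0, length=train_length, d_model=8)
```

During training the constraint passed to the encoding would be the perturbed length. At inference time, per the abstract, it would come from a separate length prediction model, which is not sketched here.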


Training Neural Machine Translation using Word Embedding-based Loss

Chousa, Sudoh, Nakamura

2018

Preprint


In neural machine translation (NMT), the computational cost at the output layer increases with the size of the target-side vocabulary. Using a limited-size vocabulary instead may cause a significant decrease in translation quality. This trade-off is derived from a softmax-based loss function that handles in-dictionary words independently, in which word similarity is not considered. In this paper, we propose a novel NMT loss function that includes word similarity in the form of distances in a word embedding space. The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder to choose similar acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation. In experiments using ASPEC Japanese-to-English and IWSLT17 English-to-French data sets, the proposed method showed improvements against a standard NMT baseline in both data sets; especially with IWSLT17 En-Fr, it achieved up to +1.72 in BLEU and +1.99 in METEOR. When the target-side vocabulary was limited to only 1,000 words, the proposed method demonstrated a substantial gain, +1.72 in METEOR with ASPEC Ja-En.
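One way to picture a loss that accounts for distances in an embedding space is the expected distance between the predicted word and the reference word. The sketch below is an assumption for illustration and is not necessarily the formulation used in the paper: it weights the Euclidean distance from each vocabulary embedding to the reference embedding by the softmax probability of that word. The function names and the toy embeddings are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def embedding_distance_loss(logits, embeddings, ref_index):
    """Expected Euclidean distance to the reference word's embedding.

    Assumption: probability mass on words near the reference is
    penalized less than mass on distant words.
    """
    probs = softmax(logits)
    ref = embeddings[ref_index]
    return sum(p * math.dist(e, ref) for p, e in zip(probs, embeddings))


# Toy 2-d embeddings: word 1 is close to the reference word 0, word 2 is far.
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
loss_near = embedding_distance_loss([0.0, 10.0, 0.0], embeddings, ref_index=0)
loss_far = embedding_distance_loss([0.0, 0.0, 10.0], embeddings, ref_index=0)
```

In this toy setup, predicting the nearby word gives a much smaller loss than predicting the distant one, which is the behavior the abstract describes as encouraging the decoder to choose similar acceptable words.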


SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP

Chousa, Nagata, Nishino

2020


We propose a novel method of automatic sentence alignment from noisy parallel documents. We first formalize the sentence alignment problem as the independent predictions of spans in the target document from sentences in the source document. We then introduce a total optimization method using integer linear programming to prevent span overlapping and obtain non-monotonic alignments. We implement cross-language span prediction by fine-tuning pre-trained multilingual language models based on the BERT architecture and train them using pseudo-labeled data obtained from an unsupervised sentence alignment method. While the baseline methods use sentence embeddings and assume monotonic alignment, our method can capture the token-to-token interaction between the tokens of source and target text and handle non-monotonic alignments. In sentence alignment experiments on English-Japanese, our method achieved an F1 score of 70.3, which is 8.0 points higher than the baseline method. In particular, our method improved F1 by 53.9 points for extracting non-parallel sentences. Our method improved the downstream machine translation accuracy by 4.1 BLEU points when the extracted bilingual sentences are used for fine-tuning a pre-trained Japanese-to-English translation model.
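The global selection step can be illustrated on a toy scale. The sketch below does not call an ILP solver. It enumerates every 0/1 assignment over the candidate spans, which optimizes the same kind of objective an ILP would (maximize total score subject to constraints) but is only feasible for a handful of candidates. The constraints used here, at most one span per source sentence and no overlapping target spans, are assumptions for illustration, as are the names and the scores.

```python
from itertools import product


def overlaps(a, b):
    """True if two inclusive (start, end) target spans share a sentence."""
    return a[0] <= b[1] and b[0] <= a[1]


def select_spans(candidates):
    """Pick the highest-scoring consistent subset of candidate spans.

    candidates: list of (source_id, (start, end), score).
    Brute-force stand-in for an ILP: tries every 0/1 assignment.
    """
    best_score, best = 0.0, []
    for mask in product([0, 1], repeat=len(candidates)):
        chosen = [c for c, keep in zip(candidates, mask) if keep]
        sources = [c[0] for c in chosen]
        if len(sources) != len(set(sources)):
            continue  # assumed constraint: one span per source sentence
        if any(overlaps(x[1], y[1])
               for i, x in enumerate(chosen) for y in chosen[i + 1:]):
            continue  # assumed constraint: target spans must not overlap
        score = sum(c[2] for c in chosen)
        if score > best_score:
            best_score, best = score, chosen
    return best


# Toy candidates: source 0 prefers a later target span than source 1.
candidates = [
    (0, (3, 4), 0.9),
    (0, (0, 1), 0.4),
    (1, (3, 5), 0.8),
    (1, (0, 2), 0.7),
]
selected = select_spans(candidates)
```

Here the best consistent choice maps source sentence 0 to target span (3, 4) and source sentence 1 to target span (0, 2), a non-monotonic alignment that independent per-sentence prediction would miss because the top spans for the two sources overlap.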


Bilingual Text Extraction as Reading Comprehension

Chousa, Nagata, Nishino

2020

Preprint

