Nikhil Kumar Lakumarapu scite author profile

Nikhil Kumar Lakumarapu

8 Publications

50 Citation Statements Received

87 Citation Statements Given

Affiliations

Samsung (South Korea)

Publications

End-end Speech-to-Text Translation with Modality Agnostic Meta-Learning

Indurthi, Han, Lakumarapu et al. 2020

End-to-end Speech Translation (ST) models have several advantages, such as lower latency, smaller model size, and less error compounding, over conventional pipelines that combine Automatic Speech Recognition (ASR) and text Machine Translation (MT) models. However, collecting large amounts of parallel data for the ST task is more difficult than for the ASR and MT tasks. Previous studies have proposed transfer learning approaches to overcome this difficulty. These approaches benefit from weakly supervised training data, such as ASR speech-to-transcript or MT text-to-text translation pairs. However, the parameters in these models are updated independently of each task, which may lead to sub-optimal solutions. In this work, we adopt a meta-learning algorithm to train a modality agnostic multi-task model that transfers knowledge from the source tasks (ASR and MT) to the target task (ST), where the ST task severely lacks data. In the meta-learning phase, the parameters of the model are exposed to vast amounts of speech transcripts (e.g., English ASR) and text translations (e.g., English-German MT). During this phase, the parameters are updated so that they capture speech and text representations and the relation between them, and also act as a good initialization point for the target ST task. We evaluate the proposed meta-learning approach for ST tasks on English-German (En-De) and English-French (En-Fr) language pairs from the Multilingual Speech Translation Corpus (MuST-C). Our method outperforms the previous transfer learning approaches and sets new state-of-the-art results for the En-De and En-Fr ST tasks by obtaining 9.18 and 11.76 BLEU point improvements, respectively.

End-to-End Simultaneous Translation System for IWSLT2020 Using Modality Agnostic Meta-Learning

Han, Zaidi, Indurthi et al. 2020

In this paper, we describe the end-to-end simultaneous speech-to-text and text-to-text translation systems submitted to the IWSLT2020 online translation challenge. The systems are built by adding wait-k and meta-learning approaches to the Transformer architecture. The systems are evaluated on different latency regimes. The simultaneous text-to-text translation system achieved a BLEU score of 26.38, compared to the competition baseline score of 14.17, on the low latency regime (average latency ≤ 3). The simultaneous speech-to-text system improves the BLEU score by 7.7 points over the competition baseline for the low latency regime (average latency ≤ 1000).
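The abstract above names the wait-k approach without describing it. As a rough illustration only, the sketch below generates the READ/WRITE schedule of a wait-k policy as it is commonly described: read k source tokens first, then alternate one write with one read, and finish writing once the source is exhausted. The function name `wait_k_actions` and its parameters are hypothetical and are not taken from the paper or the submitted system.

```python
from typing import List


def wait_k_actions(k: int, source_len: int, target_len: int) -> List[str]:
    """Return the READ/WRITE schedule of a wait-k policy (illustrative sketch).

    Assumption: token counts are known up front, which a real simultaneous
    system would not have; this only shows the order of the decisions.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    actions: List[str] = []
    read = 0      # source tokens consumed so far
    written = 0   # target tokens emitted so far
    while written < target_len:
        # Before emitting the next target token, wait-k requires
        # k + written source tokens, capped at the source length.
        needed = min(k + written, source_len)
        if read < needed:
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions


if __name__ == "__main__":
    # With k=2, a 4-token source and a 4-token target.
    print(wait_k_actions(2, 4, 4))
```

A smaller k starts writing earlier, which lowers latency but gives the model less source context for each target token.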

Task Aware Multi-Task Learning for Speech to Text Tasks

Indurthi, Zaidi, Lakumarapu et al. 2021

In general, direct speech-to-text translation (ST) is jointly trained with Automatic Speech Recognition (ASR) and Machine Translation (MT) tasks. However, issues with the current joint learning strategies inhibit knowledge transfer across these tasks. We propose a task modulation network that allows the model to learn task-specific features while simultaneously learning the shared features. The proposed approach removes the need for a separate fine-tuning step, resulting in a single model that performs all of these tasks. This single model achieves a BLEU score of 28.64 on ST (MuST-C English-German), a WER of 11.61% on ASR (TEDLium v3), and a BLEU score of 23.35 on MT (WMT'15 English-German). This sets a new state-of-the-art (SOTA) performance on the ST task while outperforming the existing end-to-end ASR systems.

Faster Re-translation Using Non-Autoregressive Model For Simultaneous Neural Machine Translation

Han, Indurthi, Zaidi et al. 2020

Preprint

Infusing Future Information into Monotonic Attention Through Language Models

Zaidi, Indurthi, Lee et al. 2021

Preprint

Language Model Augmented Monotonic Attention for Simultaneous Translation

Indurthi, Zaidi, Lee et al. 2022

The state-of-the-art adaptive policies for simultaneous neural machine translation (SNMT) use monotonic attention to make read/write decisions based on the partial source and target sequences. The lack of sufficient information might cause the monotonic attention to make poor read/write decisions, which in turn negatively affects the performance of the SNMT model. Human translators, on the other hand, make better read/write decisions because they can anticipate the immediate future words using linguistic information and domain knowledge. In this work, we propose a framework that aids monotonic attention with an external language model to improve its decisions. Experiments on MuST-C English-German and English-French speech-to-text translation tasks show that the future information from the language model further improves the state-of-the-art monotonic multi-head attention model.

Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems

Zaidi, Lee, Lakumarapu et al. 2021

Preprint

End-to-End Offline Speech Translation System for IWSLT 2020 using Modality Agnostic Meta-Learning

Lakumarapu, Lee, Indurthi et al. 2020

In this paper, we describe the system submitted to the IWSLT 2020 Offline Speech Translation Task. We adopt the Transformer architecture coupled with the meta-learning approach to build our end-to-end Speech-to-Text Translation (ST) system. Our meta-learning approach tackles the data scarcity of the ST task by leveraging the data available from Automatic Speech Recognition (ASR) and Machine Translation (MT) tasks. The meta-learning approach, combined with synthetic data augmentation techniques, improves the model performance significantly and achieves BLEU scores of 24.58, 27.51, and 27.61 on the IWSLT test 2015, MuST-C test, and Europarl-ST test sets, respectively.
