Dario Stojanovski scite author profile

Using a language model (LM) pretrained on two languages with large monolingual data in order to initialize an unsupervised neural machine translation (UNMT) system yields stateof-the-art results. When limited data is available for one language, however, this method leads to poor translations. We present an effective approach that reuses an LM that is pretrained only on a high-resource language. The monolingual LM is fine-tuned on both languages and is then used to initialize a UNMT model. To reuse the pretrained LM, we have to modify its predefined vocabulary, to account for the new language. We therefore propose a novel vocabulary extension method. Our approach, RE-LM, outperforms a competitive cross-lingual pretraining model (XLM) in English-Macedonian (En-Mk) and English-Albanian (En-Sq), yielding more than +8.3 BLEU points for all four translation directions.

show abstract

Coreference and Coherence in Neural Machine Translation: A Study Using Oracle Experiments

Stojanovski¹,

Fraser²

2018

View full text Add to dashboard Cite

Cross-sentence context can provide valuable information in Machine Translation and is critical for translation of anaphoric pronouns and for providing consistent translations. In this paper, we devise simple oracle experiments targeting coreference and coherence. Oracles are an easy way to evaluate the effect of different discourse-level phenomena in NMT using BLEU and eliminate the necessity to manually define challenge sets for this purpose. We propose two context-aware NMT models and compare them against models working on a concatenation of consecutive sentences. Concatenation models perform better, but are computationally expensive. We show that NMT models taking advantage of context oracle signals can achieve considerable gains in BLEU, of up to 7.02 BLEU for coreference and 1.89 BLEU for coherence on subtitles translation. Access to strong signals allows us to make clear comparisons between context-aware models.

show abstract

Twitter Sentiment Analysis Using Deep Convolutional Neural Network

Stojanovski

Strezoski

Madjarov

et al. 2015

View full text Add to dashboard Cite

Finki at SemEval-2016 Task 4: Deep Learning Architecture for Twitter Sentiment Analysis

Stojanovski

Strezoski

Madjarov

et al. 2016

View full text Add to dashboard Cite

In this paper, we present a novel deep learning architecture for sentiment analysis in Twitter messages. Our system finki, employs both convolutional and gated recurrent neural networks to obtain a more diverse tweet representation. The network is trained on top of GloVe word embeddings pre-trained on the Common Crawl dataset. Both neural networks are used to obtain a fixed length representation of variable sized tweets, and the concatenation of these vectors is supplied to a fully connected softmax layer with dropout regularization. The system is evaluated on benchmark datasets from the Sentiment Analysis in Twitter task of the SemEval 2016 challenge where our model achieves best and second highest results on the 2-point and 5-point quantification subtasks respectively. Despite not relying on any hand-crafted features, our system manages the second highest average rank on the considered subtasks.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.