Alexander Fraser scite author profile

Automatic word alignment plays a critical role in statistical machine translation. Unfortunately, the relationship between alignment quality and statistical machine translation performance has not been well understood. In the recent literature, the alignment task has frequently been decoupled from the translation task and assumptions have been made about measuring alignment quality for machine translation which, it turns out, are not justified. In particular, none of the tens of papers published over the last five years has shown that significant decreases in alignment error rate (AER) result in significant increases in translation performance. This paper explains this state of affairs and presents steps towards measuring alignment quality in a way which is predictive of statistical machine translation performance.
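The abstract refers to alignment error rate (AER) without defining it. As a rough illustration, the sketch below computes AER under the commonly used definition with "sure" and "possible" gold links, AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|). The function name, the representation of links as index pairs, and the toy data are assumptions made for this example and are not taken from the paper.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """Compute AER from sets of (source_index, target_index) links.

    hypothesis: links proposed by the aligner (A)
    sure:       gold links marked as sure (S)
    possible:   gold links marked as possible (P); sure links are
                treated as possible too, as is conventional
    """
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s
    if not a and not s:
        # Nothing proposed and nothing required: treat as zero error.
        return 0.0
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Toy example (hypothetical data).
gold_sure = {(0, 0), (1, 1)}
gold_possible = {(2, 1)}
proposed = {(0, 0), (2, 1), (2, 2)}
aer = alignment_error_rate(proposed, gold_sure, gold_possible)
```

In this toy case one proposed link is sure and two are at least possible, so the value works out to 1 - 3/5 = 0.4. A lower AER means closer agreement with the gold alignment, which, as the abstract argues, does not by itself guarantee better translation performance.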

Joint Lemmatization and Morphological Tagging with Lemming

Müller, Cotterell, Fraser et al., 2015

We present LEMMING, a modular log-linear model that jointly models lemmatization and tagging and supports the integration of arbitrary global features. It is trainable on corpora annotated with gold standard tags and lemmata and does not rely on morphological dictionaries or analyzers. LEMMING sets the new state of the art in token-based statistical lemmatization on six languages; e.g., for Czech lemmatization, we reduce the error by 60%, from 4.05 to 1.58. We also give empirical evidence that jointly modeling morphological tags and lemmata is mutually beneficial.

Perception of illusory movement

Fraser, Wilcox, 1979, Nature

Intensive studies of visual illusion have rarely shown examples of polymorphic responses. We show here that, using figures consisting of stripes shaded from dark to light, arranged in repeating sectors, an illusion of movement can be induced in about 75% of observers when viewed peripherally. The responses of the viewers fall into four categories. This polymorphic response suggests a genetic origin.

On the Language Neutrality of Pre-trained Multilingual Representations

Libovický, Rosa, Fraser, 2020

Multilingual contextual embeddings, such as multilingual BERT and XLM-RoBERTa, have proved useful for many multilingual tasks. Previous work probed the cross-linguality of the representations indirectly using zero-shot transfer learning on morphological and syntactic tasks. We instead investigate the language neutrality of multilingual contextual embeddings directly and with respect to lexical semantics. Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings, which are explicitly trained for language neutrality. Contextual embeddings are still only moderately language-neutral by default, so we propose two simple methods for achieving stronger language neutrality: first, by unsupervised centering of the representation for each language and second, by fitting an explicit projection on small parallel data. Besides, we show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences without using parallel data.
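The first method named in the abstract, unsupervised centering of the representation for each language, can be pictured with a small sketch. It assumes that centering means subtracting each language's mean vector from the embeddings of that language; the abstract does not spell out the procedure, so this is an interpretation, not the authors' implementation. The function name and the toy vectors are hypothetical.

```python
from collections import defaultdict


def center_by_language(embeddings, languages):
    """Subtract the per-language mean vector from each embedding.

    embeddings: list of equal-length lists of floats
    languages:  list of language codes, one per embedding
    """
    sums = {}
    counts = defaultdict(int)
    for vec, lang in zip(embeddings, languages):
        if lang not in sums:
            sums[lang] = [0.0] * len(vec)
        for i, x in enumerate(vec):
            sums[lang][i] += x
        counts[lang] += 1

    # Mean vector for every language seen in the input.
    means = {
        lang: [s / counts[lang] for s in total]
        for lang, total in sums.items()
    }

    return [
        [x - m for x, m in zip(vec, means[lang])]
        for vec, lang in zip(embeddings, languages)
    ]


# Toy example (hypothetical 2-dimensional "embeddings").
vectors = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [12.0, 2.0]]
langs = ["en", "en", "de", "de"]
centered = center_by_language(vectors, langs)
```

After centering, every language's vectors have zero mean, so a constant language-specific offset no longer separates the two groups. No parallel data is needed for this step, only a language label per sentence.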

Empirical studies in strategies for Arabic retrieval

Xu, Fraser, Weischedel, 2002
