Sascha Rothe scite author profile

Leveraging Pre-trained Checkpoints for Sequence Generation Tasks

Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing. By warm-starting from the publicly released checkpoints, NLP practitioners have pushed the state-of-the-art on multiple benchmarks while saving significant amounts of compute time. So far the focus has been mainly on the Natural Language Understanding tasks. In this paper, we demonstrate the efficacy of pre-trained checkpoints for Sequence Generation. We developed a Transformer-based sequence-to-sequence model that is compatible with publicly available pre-trained BERT, GPT-2, and RoBERTa checkpoints and conducted an extensive empirical study on the utility of initializing our model, both encoder and decoder, with these checkpoints. Our models result in new state-of-the-art results on Machine Translation, Text Summarization, Sentence Splitting, and Sentence Fusion.

AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes

Rothe, Schütze, 2015

209

215

We present AutoExtend, a system to learn embeddings for synsets and lexemes. It is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The synset/lexeme embeddings obtained live in the same vector space as the word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet as a lexical resource, but AutoExtend can be easily applied to other resources like Freebase. AutoExtend achieves state-of-the-art performance on word similarity and word sense disambiguation tasks.

Ultradense Word Embeddings by Orthogonal Transformation

Rothe, Ebert, Schütze, 2016

Embeddings are generic representations that are useful for many NLP tasks. In this paper, we introduce DENSIFIER, a method that learns an orthogonal transformation of the embedding space that focuses the information relevant for a task in an ultradense subspace of a dimensionality that is smaller by a factor of 100 than the original space. We show that ultradense embeddings generated by DENSIFIER reach state of the art on a lexicon creation task in which words are annotated with three types of lexical information: sentiment, concreteness, and frequency. On the SemEval2015 10B sentiment analysis task we show that no information is lost when the ultradense subspace is used, but training is an order of magnitude more efficient due to the compactness of the ultradense space.

Encode, Tag, Realize: High-Precision Text Editing

Malmi, Krause, Rothe et al. 2019

116

We propose LASERTAGGER, a sequence tagging approach that casts text generation as a text editing task. Target texts are reconstructed from the inputs using three main edit operations: keeping a token, deleting it, and adding a phrase before the token. To predict the edit operations, we propose a novel model, which combines a BERT encoder with an autoregressive Transformer decoder. This approach is evaluated on English text on four tasks: sentence fusion, sentence splitting, abstractive summarization, and grammar correction. LASERTAGGER achieves new state-of-the-art results on three of these tasks, performs comparably to a set of strong seq2seq baselines with a large number of training examples, and outperforms them when the number of examples is limited. Furthermore, we show that at inference time tagging can be more than two orders of magnitude faster than comparable seq2seq models, making it more attractive for running in a live environment.
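
The three edit operations named in the abstract above (keep a token, delete it, add a phrase before it) can be illustrated with a minimal sketch. This is not the authors' implementation: the tag representation (a pair of base operation and optional phrase), the function name `realize`, and the example sentence are assumptions made for illustration, and the sketch covers only the step that turns already-predicted tags into text, not the BERT encoder or Transformer decoder that predicts them.

```python
from typing import List, Optional, Tuple

# Hypothetical tag format (an assumption, not taken from the paper's code):
# (base_op, added_phrase), where base_op is "KEEP" or "DELETE" and
# added_phrase, if given, is inserted before the tagged token.
Tag = Tuple[str, Optional[str]]


def realize(tokens: List[str], tags: List[Tag]) -> str:
    """Rebuild the target text from source tokens and one edit tag per token."""
    if len(tokens) != len(tags):
        raise ValueError("expected exactly one tag per token")
    output: List[str] = []
    for token, (base_op, added_phrase) in zip(tokens, tags):
        if added_phrase:
            # The added phrase always goes before the current token,
            # whether that token is kept or deleted.
            output.append(added_phrase)
        if base_op == "KEEP":
            output.append(token)
        elif base_op != "DELETE":
            raise ValueError(f"unknown base operation: {base_op}")
    return " ".join(output)


# Sentence fusion example: drop the first full stop, then replace the
# repeated subject with the added phrase "and".
tokens = "Turing was born in 1912 . Turing died in 1954 .".split()
tags = (
    [("KEEP", None)] * 5
    + [("DELETE", None), ("DELETE", "and")]
    + [("KEEP", None)] * 4
)
print(realize(tokens, tags))  # Turing was born in 1912 and died in 1954 .
```

Because the output is assembled from source tokens plus a set of insertable phrases, the prediction problem in this sketch reduces to choosing one tag per input token.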

Learning to Attend, Copy, and Generate for Session-Based Query Suggestion

et al. 2017

Users try to articulate their complex information needs during search sessions by reformulating their queries. To make this process more effective, search engines provide related queries to help users in specifying the information need in their search process. In this paper we propose a customized sequence-to-sequence model for session-based query suggestion. In our model, we employ a query-aware attention mechanism to capture the structure of the session context. This enables us to control the scope of the session from which we infer the suggested next query, which helps not only handle the noisy data but also automatically detect session boundaries. Furthermore we observe that, based on the user query reformulation behavior, within a single session a large portion of query terms is retained from the previously submitted queries and consists of mostly infrequent or unseen terms that are usually not included in the vocabulary. We therefore empower the decoder of our model to access the source words from the session context during decoding by incorporating a copy mechanism. Moreover, we propose evaluation metrics to assess the quality of the generative models for query suggestion. We conduct an extensive set of experiments and analysis. The results suggest that our model outperforms the baselines both in terms of the generating queries and scoring candidate queries for the task of query suggestion.
