BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, they require that both sentences are fed into the network, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy of BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
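The speed-up comes from encoding each sentence once and comparing fixed-size embeddings, instead of running one cross-encoder inference per sentence pair. A minimal NumPy sketch of the search step (the random matrix below is a placeholder standing in for SBERT embeddings, not real model output):

```python
import numpy as np

def most_similar_pair(embeddings):
    """Find the most similar pair of sentences via cosine similarity.

    With precomputed embeddings the all-pairs comparison is a single
    matrix product over normalized rows.
    """
    # Normalize rows so the dot product equals cosine similarity.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    # Mask the diagonal: a sentence is trivially most similar to itself.
    np.fill_diagonal(sims, -np.inf)
    i, j = np.unravel_index(np.argmax(sims), sims.shape)
    return int(i), int(j), float(sims[i, j])

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 8))               # stand-in for sentence embeddings
emb[7] = emb[2] + 0.01 * rng.normal(size=8)  # make items 2 and 7 near-duplicates
i, j, score = most_similar_pair(emb)
print(i, j, score)
```

With 10,000 sentences this is 10,000 encoder passes plus one matrix product, versus roughly 50 million cross-encoder inferences for the pairwise setup.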
Molecular prognostic indicators for oropharyngeal squamous cell carcinoma (OSCC), including HPV-DNA detection, epidermal growth factor receptor (EGFR) and p16 expression, have been suggested in the literature, but none of these are currently used in clinical practice. To compare these predictors, 106 newly diagnosed OSCCs were analyzed for the presence of HPV-DNA and expression of p16 and EGFR. The 5-year disease-free survival (DFS) and overall survival (OS) were calculated in relation to these markers and a multivariate Cox analysis was performed. Twenty-eight percent of the cases contained oncogenic HPV-DNA and 30% were positive for p16. The p16 expression was highly correlated with the presence of HPV-DNA (p < 0.001). Univariate analysis of the 5-year DFS revealed a significantly better outcome for patients with p16-positive tumors (84% vs. 49%, p = 0.009). EGFR-negative tumors showed a tendency toward a better prognosis in DFS (74% vs. 47%, p = 0.084) and OS (70% vs. 45%, p = 0.100). Remarkable and highly significant was the combination of p16 and EGFR expression status, leading to a 5-year DFS of 93% for p16+/EGFR− tumors vs. 39% for p16−/EGFR+ tumors (p = 0.003) and to a 5-year OS of 79% vs. 38%, respectively (p = 0.010). In multivariate analysis p16 remained a highly significant prognostic marker for DFS (p = 0.030), showing a 7.5-fold increased risk of relapse in patients with p16-negative tumors. Our data indicate that p16 expression is the most reliable prognostic marker for OSCC and further might be a surrogate marker for HPV-positive OSCC. HPV+/p16+ tumors tended to have decreased EGFR expression, but using both immunohistological markers has significant prognostic implications. © 2007 Wiley-Liss, Inc.
In this paper we show that reporting a single performance score is insufficient to compare non-deterministic approaches. We demonstrate for common sequence tagging tasks that the seed value for the random number generator can result in statistically significant (p < 10⁻⁴) differences for state-of-the-art systems. For two recent systems for NER, we observe an absolute difference of one percentage point F1-score depending on the selected seed value, making these systems appear either state-of-the-art or mediocre. Instead of publishing and reporting single performance scores, we propose to compare score distributions based on multiple executions. Based on the evaluation of 50,000 LSTM networks for five sequence tagging tasks, we present network architectures that both produce superior performance and are more stable with respect to the remaining hyperparameters. The full experimental results are published in (Reimers and Gurevych, 2017).
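Reporting a score distribution instead of a single number is mechanically simple: repeat training with different seeds and summarize. A stdlib-only sketch, where `train_and_evaluate` is a hypothetical stand-in that simulates seed-dependent F1 noise rather than training a real tagger:

```python
import random
import statistics

def train_and_evaluate(seed):
    """Hypothetical stand-in for training a sequence tagger with one seed.

    The F1 score is simulated as a draw around a system-level mean; in
    practice this would train and evaluate the actual model.
    """
    rng = random.Random(seed)
    return 90.0 + rng.gauss(0.0, 0.5)  # simulated seed-dependent F1

# Evaluate a distribution over many seeds instead of a single run.
scores = [train_and_evaluate(seed) for seed in range(20)]
mean = statistics.mean(scores)
std = statistics.stdev(scores)
print(f"F1 = {mean:.2f} +/- {std:.2f} over {len(scores)} seeds "
      f"(min {min(scores):.2f}, max {max(scores):.2f})")
```

The min/max spread makes the paper's point concrete: the same architecture can look state-of-the-art or mediocre depending on which single seed happens to be reported.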
We present an easy and efficient method to extend existing sentence embedding models to new languages. This makes it possible to create multilingual versions of previously monolingual models. The training is based on the idea that a translated sentence should be mapped to the same location in the vector space as the original sentence. We use the original (monolingual) model to generate sentence embeddings for the source language and then train a new system on translated sentences to mimic the original model. Compared to other methods for training multilingual sentence embeddings, this approach has several advantages: it is easy to extend existing models to new languages with relatively few samples, it is easier to ensure desired properties for the vector space, and the hardware requirements for training are lower. We demonstrate the effectiveness of our approach for 50+ languages from various language families. Code to extend sentence embedding models to more than 400 languages is publicly available.
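The "mimic the original model" idea is a distillation objective: the student must reproduce the teacher's embedding both for the source sentence and for its translation. A minimal NumPy sketch of that loss (the arrays are random placeholders, not real model embeddings):

```python
import numpy as np

def distillation_loss(teacher_src, student_src, student_tgt):
    """Multilingual distillation objective (sketch).

    For a batch of (source, translation) pairs, the student is trained
    to match the teacher's source embedding with both its own source
    embedding and its embedding of the translation:
        L = mean ||teacher(s) - student(s)||^2
          + mean ||teacher(s) - student(t)||^2
    """
    return float(np.mean((teacher_src - student_src) ** 2)
                 + np.mean((teacher_src - student_tgt) ** 2))

rng = np.random.default_rng(1)
teacher = rng.normal(size=(4, 16))               # teacher embeddings (source)
student = teacher + 0.1 * rng.normal(size=(4, 16))  # imperfect student
loss = distillation_loss(teacher, student, student)
print(loss)
```

Minimizing this pushes the student to place a sentence and its translation at the same point in the teacher's vector space, which is exactly the property the method relies on.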