Noam Razin scite author profile

Noam Razin

6 Publications

35 Citation Statements Received

169 Citation Statements Given


Affiliations

Tel Aviv University

Publications


Scalable Attentive Sentence Pair Modeling via Distilled Sentence Embedding

Barkan

Razin

Malkiel

et al. 2020

AAAI


Recent state-of-the-art natural language understanding models, such as BERT and XLNet, score a pair of sentences (A and B) using multiple cross-attention operations – a process in which each word in sentence A attends to all words in sentence B and vice versa. As a result, computing the similarity between a query sentence and a set of candidate sentences requires the propagation of all query-candidate sentence pairs throughout a stack of cross-attention layers. This exhaustive process becomes computationally prohibitive when the number of candidate sentences is large. In contrast, sentence embedding techniques learn a sentence-to-vector mapping and compute the similarity between the sentence vectors via simple elementary operations. In this paper, we introduce Distilled Sentence Embedding (DSE) – a model that is based on knowledge distillation from cross-attentive models, focusing on sentence-pair tasks. The outline of DSE is as follows: Given a cross-attentive teacher model (e.g. a fine-tuned BERT), we train a sentence-embedding-based student model to reconstruct the sentence-pair scores obtained by the teacher model. We empirically demonstrate the effectiveness of DSE on five GLUE sentence-pair tasks. DSE significantly outperforms several ELMo variants and other sentence embedding methods, while accelerating computation of the query-candidate sentence-pair similarities by several orders of magnitude, with an average relative degradation of 4.6% compared to BERT. Furthermore, we show that DSE produces sentence embeddings that reach state-of-the-art performance on universal sentence representation benchmarks. Our code is made publicly available at https://github.com/microsoft/Distilled-Sentence-Embedding.
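The distillation idea in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the averaging encoder, the dot-product pair score, the mean-squared-error objective, the teacher scores, and every function name below are assumptions made for illustration only. What it shows is the cost argument: candidate vectors are computed once, so each query needs one embedding plus cheap vector operations rather than a cross-attention pass per candidate.

```python
import random

DIM = 8  # embedding size, chosen arbitrarily for this sketch


def build_table(sentences, seed=0):
    """Give each token a fixed random vector (stand-in for a trained student encoder)."""
    rng = random.Random(seed)
    vocab = sorted({tok for s in sentences for tok in s.lower().split()})
    return {tok: [rng.uniform(-1, 1) for _ in range(DIM)] for tok in vocab}


def embed(sentence, table):
    """Map one sentence to one vector by averaging its token vectors."""
    tokens = sentence.lower().split()
    vec = [0.0] * DIM
    for tok in tokens:
        for i, value in enumerate(table[tok]):
            vec[i] += value / len(tokens)
    return vec


def student_score(u, v):
    """Score a pair from two independently computed embeddings (dot product, an assumption)."""
    return sum(a * b for a, b in zip(u, v))


def distillation_loss(pairs, teacher_scores, table):
    """Mean squared error between student pair scores and the teacher's pair scores."""
    errors = [
        (student_score(embed(a, table), embed(b, table)) - t) ** 2
        for (a, b), t in zip(pairs, teacher_scores)
    ]
    return sum(errors) / len(errors)


def rank_candidates(query, candidate_vecs, table):
    """Embed the query once, then score it against every cached candidate vector."""
    q = embed(query, table)
    scores = [student_score(q, c) for c in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


candidates = ["the cat sat", "stocks fell today", "a cat sat down"]
query = "the cat sat"
table = build_table(candidates + [query])

candidate_vecs = [embed(c, table) for c in candidates]  # computed once, offline
ranking = rank_candidates(query, candidate_vecs, table)

pairs = [(query, c) for c in candidates]
teacher_scores = [1.0, 0.0, 0.8]  # made-up numbers standing in for teacher outputs
loss = distillation_loss(pairs, teacher_scores, table)
```

In a real setup the token table would be replaced by a trainable encoder and `loss` would be minimized by gradient descent; the sketch only evaluates the objective.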


RecoBERT: A Catalog Language Model for Text-Based Recommendations

Malkiel

Barkan

Caciularu

et al. 2020


Language models that utilize extensive self-supervised pre-training from unlabeled text have recently been shown to significantly advance the state-of-the-art performance in a variety of language understanding tasks. However, it is yet unclear if and how these recent models can be harnessed for conducting text-based recommendations. In this work, we introduce RecoBERT, a BERT-based approach for learning catalog-specialized language models for text-based item recommendations. We suggest novel training and inference procedures for scoring similarities between pairs of items that don't require item similarity labels. Both the training and the inference techniques were designed to utilize the unlabeled structure of textual catalogs and to minimize the discrepancy between them. By incorporating four scores during inference, RecoBERT can infer text-based item-to-item similarities more accurately than other techniques. In addition, we introduce a new language understanding task for wine recommendations using similarities based on professional wine reviews. As an additional contribution, we publish an annotated recommendations dataset crafted by human wine experts. Finally, we evaluate RecoBERT and compare it to various state-of-the-art NLP models on wine and fashion recommendation tasks.


RecoBERT: A Catalog Language Model for Text-Based Recommendations

Malkiel

Barkan

Caciularu

et al. 2020

Preprint


Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding

Barkan

Razin

Malkiel

et al. 2019

Preprint


Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

Razin

Maman

Cohen

2022

Preprint


What Makes Data Suitable for a Locally Connected Neural Network? A Necessary and Sufficient Condition Based on Quantum Entanglement

Alexander

De La Vega

Razin

et al. 2023

Preprint


The question of what makes a data distribution suitable for deep learning is a fundamental open problem. Focusing on locally connected neural networks (a prevalent family of architectures that includes convolutional and recurrent neural networks as well as local self-attention models), we address this problem by adopting theoretical tools from quantum physics. Our main theoretical result states that a certain locally connected neural network is capable of accurate prediction over a data distribution if and only if the data distribution admits low quantum entanglement under certain canonical partitions of features. As a practical application of this result, we derive a preprocessing method for enhancing the suitability of a data distribution to locally connected neural networks. Experiments with widespread models over various datasets demonstrate our findings. We hope that our use of quantum entanglement will encourage further adoption of tools from physics for formally reasoning about the relation between deep learning and real-world data.
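To make "entanglement under a partition" concrete, here is a minimal sketch of the standard entanglement entropy of a tensor with respect to a split of its axes, computed from the singular values of the corresponding matricization. It assumes NumPy is available, and it is a generic textbook computation rather than the paper's procedure; the function name and the example tensors are hypothetical.

```python
import numpy as np


def entanglement_entropy(tensor, left_axes):
    """Entropy of the normalized squared singular values of `tensor`,
    matricized so that `left_axes` index rows and the remaining axes index columns."""
    left_axes = list(left_axes)
    right_axes = [a for a in range(tensor.ndim) if a not in left_axes]
    left_dim = int(np.prod([tensor.shape[a] for a in left_axes]))
    matrix = np.transpose(tensor, left_axes + right_axes).reshape(left_dim, -1)
    singular = np.linalg.svd(matrix, compute_uv=False)
    probs = singular ** 2 / np.sum(singular ** 2)
    probs = probs[probs > 1e-12]  # drop numerically zero terms before taking logs
    return float(-np.sum(probs * np.log(probs)))


# A product (rank-one) tensor has no entanglement across the split.
product = np.einsum("i,j->ij", np.array([1.0, 2.0]), np.array([3.0, 4.0]))
low = entanglement_entropy(product, left_axes=[0])

# A tensor whose two axes are perfectly correlated has maximal entanglement, log(2) here.
correlated = np.eye(2)
high = entanglement_entropy(correlated, left_axes=[0])
```

Low values mean the two groups of axes are close to independent under that split; higher values mean they are strongly coupled.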


