Proceedings of the 24th ACM International on Conference on Information and Knowledge Management 2015
DOI: 10.1145/2806416.2806475
Short Text Similarity with Word Embeddings

Abstract: Determining semantic similarity between texts is important in many tasks in information retrieval such as search, query suggestion, automatic summarization and image finding. Many approaches have been suggested, based on lexical matching, handcrafted patterns, syntactic parse trees, external sources of structured semantic knowledge and distributional semantics. However, lexical features, like string matching, do not capture semantic similarity beyond a trivial level. Furthermore, handcrafted patterns and exter…
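The abstract contrasts lexical matching with embedding-based similarity: two texts can be semantically close with zero word overlap. A minimal sketch of that idea, assuming toy hand-made word vectors (the paper itself learns a supervised model over embedding-derived features; the averaged-vector cosine below is only the simplest illustrative baseline, and all names here are invented for the example):

```python
from math import sqrt

# Toy 3-d word vectors for illustration only (an assumption); a real system
# would load pre-trained embeddings such as word2vec or GloVe.
EMBEDDINGS = {
    "car":   (0.9, 0.1, 0.0),
    "auto":  (0.85, 0.15, 0.05),
    "fast":  (0.2, 0.9, 0.1),
    "fruit": (0.0, 0.1, 0.95),
}

def text_vector(text):
    """Represent a short text as the mean of its in-vocabulary word vectors."""
    vecs = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vecs:
        return (0.0, 0.0, 0.0)
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similarity(t1, t2):
    return cosine(text_vector(t1), text_vector(t2))

print(similarity("car", "auto"))   # high despite zero lexical overlap
print(similarity("car", "fruit"))  # much lower
```

Note how "car" and "auto" score highly even though a string-matching feature would score them zero, which is exactly the gap the abstract points at.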

Cited by 355 publications (252 citation statements)
References 23 publications
“…In this survey, we review neural models for textual similarity only if the model is evaluated for retrieval of similar textual units. Limiting to Similar Item Retrieval, we exclude works on neural models for general purpose textual similarity such as Hill et al (2016), Kenter and de Rijke (2015).…”
mentioning
confidence: 99%
“…First, using pre-trained word embeddings like combining traditional retrieval models with an embedding-based translation model [16,58], using pre-trained embeddings for query expansion to improve retrieval [57], and representing documents as Bag-of-Word-Embeddings (BoWE) [20,27]. Second, learning representations from scratch like learning representations of words and documents [28,32] and employing them in retrieval task [2,3], and learning representations in an end-to-end neural model for learning a specific task like entity ranking for expert finding [53] or product search [52].…”
Section: Objectives
mentioning
confidence: 99%
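The citation statement above mentions representing documents as Bag-of-Word-Embeddings (BoWE) for retrieval. A small sketch of one common way such representations are scored, assuming toy vectors (the function and vector names are invented for illustration, not taken from the cited works): each query term is soft-matched against its best-matching document term by cosine, and the per-term maxima are averaged.

```python
from math import sqrt

# Toy 2-d embeddings for illustration (an assumption); real BoWE systems
# use pre-trained vectors such as word2vec.
VECS = {
    "movie": (0.9, 0.1),
    "film":  (0.88, 0.12),
    "river": (0.1, 0.9),
    "bank":  (0.2, 0.8),
}

def cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def bowe(text):
    """Bag-of-Word-Embeddings: the list of word vectors in a text."""
    return [VECS[w] for w in text.lower().split() if w in VECS]

def soft_match(query, doc):
    """Average, over query terms, of the best cosine match in the document."""
    q, d = bowe(query), bowe(doc)
    if not q or not d:
        return 0.0
    return sum(max(cos(qv, dv) for dv in d) for qv in q) / len(q)

print(soft_match("movie", "film river"))  # 'film' soft-matches 'movie'
print(soft_match("movie", "river bank"))  # no close match, lower score
```

Unlike exact term matching, this ranks a document containing "film" highly for the query "movie", which is the behavior the embedding-based retrieval work quoted above is after.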
“…Although research on natural language processing and text mining can be regarded as mature, it has largely focused on the assumption of well-written "long enough" documents [36]. Several methods for word and phrase similarity have been proposed in recent years [38,39], which aim at measuring similarity between text that may not contain any words in common. While not all these methods are directly applicable for social media, they represent promising approaches to identify related text content.…”
Section: Page 766
mentioning
confidence: 99%