Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1089
Siamese CBOW: Optimizing Word Embeddings for Sentence Representations

Abstract: We present the Siamese Continuous Bag of Words (Siamese CBOW) model, a neural network for efficient estimation of high-quality sentence embeddings. Averaging the embeddings of words in a sentence has proven to be a surprisingly successful and efficient way of obtaining sentence embeddings. However, word embeddings trained with the methods currently available are not optimized for the task of sentence representation and are thus likely to be suboptimal. Siamese CBOW handles this problem by training word embedding…
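The averaging baseline the abstract refers to can be sketched in a few lines: a sentence embedding is simply the mean of its word vectors, and two sentences are compared by cosine similarity. The 4-dimensional vectors below are made up for illustration; real systems would use trained embeddings.

```python
import numpy as np

# Toy vocabulary of word embeddings (made-up values, not trained vectors).
word_vectors = {
    "the": np.array([0.1, 0.0, 0.2, 0.1]),
    "cat": np.array([0.9, 0.1, 0.0, 0.3]),
    "sat": np.array([0.2, 0.8, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1, 0.3]),
    "ran": np.array([0.1, 0.7, 0.2, 0.1]),
}

def sentence_embedding(tokens):
    """Average the embeddings of the words in a sentence."""
    return np.mean([word_vectors[t] for t in tokens], axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

s1 = sentence_embedding(["the", "cat", "sat"])
s2 = sentence_embedding(["the", "dog", "ran"])
print(cosine(s1, s2))
```

Siamese CBOW's point is that embeddings trained for word-level objectives are not tuned for this averaging step, so it trains them for it directly.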

Cited by 178 publications (117 citation statements)
References 19 publications
“…Following the Tf-Idf weighting schema, another compositional way for building document representations has been introduced by [23], allowing to better fit with matching tasks. A more complex approach is inspired by neural language models [10,11]. Following the CBOW and the skip-gram frameworks [15] respectively, the Siamese CBOW model [10] and the Skip-thought [11] learn sentence representations by either predicting a sentence from its surrounding sentences or its context sentences from the encoded sentence.…”
Section: Traditional Neural Approaches For Learning Text
confidence: 99%
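The training idea the citation describes — Siamese CBOW learning sentence representations by predicting a sentence's neighbors — can be sketched as a softmax over cosine similarities: the averaged embedding of the current sentence is compared to its surrounding sentences (positives) and to randomly sampled sentences (negatives), and the loss rewards probability mass on the neighbors. This is a simplified numpy sketch with made-up embeddings, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Toy sentence embeddings; in Siamese CBOW each sentence embedding is the
# average of its word embeddings, and those word embeddings are the
# trainable parameters.
anchor    = rng.normal(size=8)                                     # current sentence
positives = [anchor + 0.1 * rng.normal(size=8) for _ in range(2)]  # neighboring sentences
negatives = [rng.normal(size=8) for _ in range(3)]                 # random sentences

sims  = np.array([cosine(anchor, s) for s in positives + negatives])
probs = softmax(sims)

# Cross-entropy against a uniform target over the true neighbors.
loss = -np.mean(np.log(probs[:2]))
print(loss)
```

Gradient descent on this loss pushes word embeddings so that averaged sentence vectors of adjacent sentences are close in cosine space, which is exactly the property the averaging baseline relies on.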
“…A more complex approach is inspired by neural language models [10,11]. Following the CBOW and the skip-gram frameworks [15] respectively, the Siamese CBOW model [10] and the Skip-thought [11] learn sentence representations by either predicting a sentence from its surrounding sentences or its context sentences from the encoded sentence. As an extension of word2vec, the Paragraph-Vector model [12] jointly learns paragraph (or document) and word representations within the same embedding space.…”
Section: Traditional Neural Approaches For Learning Text
confidence: 99%
“…Recent years have seen neural networks being applied to all key parts of the typical modern IR pipeline, such core ranking algorithms [26,42,51], click models [9,10], knowledge graphs [8,35], text similarity [28,47], entity retrieval [52,53], language modeling [5], question answering [22,56], and dialogue systems [34,54].…”
Section: Motivation
confidence: 99%
“…In recent years, there have also been several studies that extend the proportion from word level to sentence, paragraph, or even document level, such as doc2vec (Mikolov et al, 2013), FastText (Bojanowski et al, 2017), and Siamese-CBOW (Kenter et al, 2016). Following the fruitful progress of these techniques of word and sentence embeddings, this paper presents a web-based information system, RiskFinder, that broadens the content analysis from the word level to sentence level for financial reports.…”
Section: Introduction
confidence: 99%
“…In addition to the 10-K corpus, we also construct a set of labeled financial sentences with respect to financial risk by involving 8 financial specialists, including accountants and financial analysts, to ensure the quality of the labeling. With the labeled sentences and the large collection of financial reports, we apply FastText (Bojanowski et al, 2017) and Siamese-CBOW (Kenter et al, 2016) to sentence-level textual analysis. Due to the superior performance of FastText, the system highlights high-risk sentences in those reports using FastText.…”
Section: Introduction
confidence: 99%
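The sentence-level risk highlighting described in the citation above can be sketched with a toy linear classifier over averaged word embeddings — a rough stand-in for FastText's linear model over bag-of-n-gram embeddings. All vectors, weights, and sentences here are hypothetical, not data from the RiskFinder system.

```python
import numpy as np

# Hypothetical hand-made 2-d word vectors and toy classifier weights;
# a real system would learn both from labeled sentences.
word_vectors = {
    "litigation": np.array([1.0, 0.0]),
    "default":    np.array([1.0, 0.0]),
    "risk":       np.array([1.0, 0.0]),
    "revenue":    np.array([0.0, 1.0]),
    "grew":       np.array([0.0, 1.0]),
}
weights = np.array([4.0, -1.0])  # toy "risk direction"

def risk_score(sentence):
    """Logistic score from the averaged embedding of in-vocabulary words."""
    vecs = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not vecs:
        return 0.0
    z = float(np.dot(np.mean(vecs, axis=0), weights))
    return 1.0 / (1.0 + np.exp(-z))

sentences = ["We face litigation and default risk", "Revenue grew this quarter"]
flagged = [s for s in sentences if risk_score(s) > 0.5]
print(flagged)
```

Highlighting then reduces to flagging every sentence whose score crosses a threshold, which is the behavior the citing paper reports for its FastText-based system.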