Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1048

CSE: Conceptual Sentence Embeddings based on Attention Model

Abstract: Most sentence embedding models represent each sentence using only surface word forms, which leaves them unable to discriminate between senses of ubiquitous homonymous and polysemous words. To enhance the representational capability of sentences, we employ a conceptualization model to assign associated concepts to each sentence in the text corpus, and then learn conceptual sentence embeddings (CSE). Hence, this semantic representation is more expressive than some widely used text representation models such as the latent topic model, …
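To make the abstract's pipeline concrete, here is a minimal sketch of the idea of conceptual sentence embedding: concepts are assigned to a sentence via a lookup (a toy dictionary stands in for a knowledge base such as Probase), and sentence, concept, and context word vectors are trained jointly with a simplified PV-DM-style averaging objective. The corpus, the concept table, and all hyperparameters are illustrative assumptions, not the authors' implementation, and the paper's attention mechanism is omitted for brevity.

```python
# Illustrative sketch of conceptual sentence embedding (not the authors' code).
# Assumptions: a toy word -> concept dictionary stands in for a knowledge base
# such as Probase, and a simplified PV-DM-style objective replaces the paper's
# full attention model.
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus and a hypothetical word -> concept mapping.
corpus = [
    "apple released a new phone".split(),
    "i ate a juicy apple".split(),
]
concepts = {"apple": ["company", "fruit"], "phone": ["device"], "juicy": ["attribute"]}

vocab = sorted({w for s in corpus for w in s})
all_concepts = sorted({c for cs in concepts.values() for c in cs})
w2i = {w: i for i, w in enumerate(vocab)}
c2i = {c: i for i, c in enumerate(all_concepts)}

dim = 16
W = rng.normal(scale=0.1, size=(len(vocab), dim))         # word vectors
C = rng.normal(scale=0.1, size=(len(all_concepts), dim))  # concept vectors
S = rng.normal(scale=0.1, size=(len(corpus), dim))        # sentence vectors
O = rng.normal(scale=0.1, size=(len(vocab), dim))         # output (softmax) weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

lr, window = 0.05, 2
for epoch in range(200):
    for sid, sent in enumerate(corpus):
        # Concepts associated with this sentence (simple conceptualization step).
        cids = [c2i[c] for w in sent for c in concepts.get(w, [])]
        for pos, target in enumerate(sent):
            ctx = [w2i[w] for i, w in enumerate(sent)
                   if i != pos and abs(i - pos) <= window]
            # Hidden vector: average of sentence, concept, and context word vectors.
            parts = [S[sid]] + [C[i] for i in cids] + [W[i] for i in ctx]
            h = np.mean(parts, axis=0)
            p = softmax(O @ h)
            p[w2i[target]] -= 1.0          # gradient of cross-entropy wrt scores
            grad_h = O.T @ p
            O -= lr * np.outer(p, h)
            # Distribute the hidden-layer gradient back to all averaged inputs.
            g = lr * grad_h / len(parts)
            S[sid] -= g
            for i in cids:
                C[i] -= g
            for i in ctx:
                W[i] -= g

print("sentence embeddings:", S.round(3))
```

In the full model, attention weights over the assigned concepts would replace the uniform average used here.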

Cited by 47 publications (20 citation statements)
References 18 publications
“…Well-known examples include word2vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014), or fastText. Approaches for learning sentence embeddings have also been introduced, including SkipThought (Kiros et al., 2015), ParagraphVector (Le and Mikolov, 2014), Conceptual Sentence Embedding (Wang et al., 2016), Sequential Denoising Autoencoders (Hill et al., 2016), and FastSent (Hill et al., 2016). In a comparison of unsupervised sentence embedding models, Hill et al. (2016) show that the optimal embedding critically depends on the targeted downstream task.…”
Section: Related Work
confidence: 99%
“…Trends over time between visualization and data mining are revealed through spark lines appearing beside the concept label. Since the total number varies widely across concepts, we normalized the spark lines for each concept so that they reveal the relative number of papers.…”
[Extraction residue from the citing paper's concept taxonomy; recoverable labels and per-concept paper counts include: word/phrase/entity-level (679), document-level (288), hybrid (387), model inference (1335), modeling (3085), topic models (1089), neural networks (412), among others.]
Section: Visualization Of Concept Relations
confidence: 99%
“…E.g., [19] introduced an entity-level masking strategy to ensure that all of the words in the same entity were masked together during word representation training, instead of only a single word or character being masked; [20] updated contextual word representations via a form of word-to-entity attention, injecting prior knowledge into a deep neural model. On the other hand, previous work has demonstrated that leveraging extra lexical knowledge (e.g., concepts) can significantly boost the effectiveness of contextualized embeddings for words [15], entities [21], relations [22], sentences [23], and so on. Overall, the lexicon-enhanced contextualized representations produced by these models have yielded substantial gains in a number of downstream NLP tasks.…”
Section: Related Work and Motivation: A. Unsupervised Pre-training
confidence: 99%
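The entity-level masking described in the last quoted passage ([19]) can be illustrated with a short, hedged sketch: whole entity spans are masked together rather than individual tokens. The function name, the [MASK] placeholder, and the assumption that entity spans come from a prior NER step are illustrative, not code from the cited work.

```python
# Illustrative sketch of entity-level masking (not code from the cited work).
# Assumes token-level entity spans are already available, e.g. from an NER step.
import random

def entity_level_mask(tokens, entity_spans, mask_prob=0.15, mask_token="[MASK]"):
    """Mask whole entity spans together; non-entity tokens are masked individually."""
    tokens = list(tokens)
    in_entity = set()
    # Mask every token of a selected entity span, not just one of its tokens.
    for start, end in entity_spans:          # end is exclusive
        in_entity.update(range(start, end))
        if random.random() < mask_prob:
            for i in range(start, end):
                tokens[i] = mask_token
    # Tokens outside entities are masked independently, as in standard MLM.
    for i in range(len(tokens)):
        if i not in in_entity and random.random() < mask_prob:
            tokens[i] = mask_token
    return tokens

sentence = ["harry", "potter", "is", "a", "series", "of", "fantasy", "novels"]
spans = [(0, 2)]                             # "harry potter" is one entity
print(entity_level_mask(sentence, spans, mask_prob=0.5))
```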