2019
DOI: 10.1007/978-3-030-15712-8_20

Word Embeddings for Entity-Annotated Texts

Abstract: Learned vector representations of words are useful tools for many information retrieval and natural language processing tasks due to their ability to capture lexical semantics. However, while many such tasks involve or even rely on named entities as central components, popular word embedding models have so far failed to include entities as first-class citizens. While it seems intuitive that annotating named entities in the training corpus should result in more intelligent word features for downstream tasks, pe…
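
To make the abstract's core idea concrete, here is a minimal sketch (not the paper's actual models) of training skip-gram embeddings with gensim on a corpus in which entity mentions have already been collapsed into single annotated tokens, so each entity becomes a first-class vocabulary item. The toy corpus, the "ENT:" token prefix, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch, assuming entity mentions were pre-annotated and
# collapsed into single tokens before training (hypothetical format).
from gensim.models import Word2Vec

annotated_corpus = [
    ["ENT:Angela_Merkel", "met", "ENT:Barack_Obama", "in", "ENT:Berlin"],
    ["ENT:Berlin", "is", "the", "capital", "of", "ENT:Germany"],
]

model = Word2Vec(
    sentences=annotated_corpus,
    vector_size=100,   # embedding dimensionality
    window=5,          # context window size
    min_count=1,       # keep every token in this toy corpus
    sg=1,              # skip-gram variant
)

# Entities now have their own vectors alongside ordinary words.
print(model.wv.most_similar("ENT:Berlin", topn=3))
```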

Cited by 6 publications (10 citation statements)
References 36 publications
“…Toutanova et al. (2015) extract dependency paths from sentences and jointly embed them with a KG using DistMult (Yang et al., 2015) to support the relation extraction task. Several other approaches focus on jointly embedding words, entities (Yamada et al., 2017; Newman-Griffis et al., 2018; Cao et al., 2017; Almasian et al., 2019) and entity types (Gupta et al., 2017) appearing in the same textual contexts without considering the relational structure of a KG. These approaches are employed in monolingual NLP tasks including entity linking (Gupta et al., 2017; Cao et al., 2017), entity abstraction (Newman-Griffis et al., 2018) and factoid QA (Yamada et al., 2017).…”
Section: Related Work
Mentioning confidence: 99%
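
For context on the DistMult model cited in this excerpt: it scores a knowledge-graph triple (head, relation, tail) with a trilinear product over a diagonal relation matrix. The sketch below illustrates just the scoring function; the random vectors are stand-ins for trained embeddings, not anything from the cited work.

```python
# Hedged sketch of DistMult scoring (Yang et al., 2015):
# score(h, r, t) = sum_i h_i * r_i * t_i.
import numpy as np

rng = np.random.default_rng(0)
dim = 50
head = rng.normal(size=dim)      # entity embedding h
relation = rng.normal(size=dim)  # diagonal of the relation matrix r
tail = rng.normal(size=dim)      # entity embedding t

score = np.sum(head * relation * tail)
print(f"DistMult score: {score:.4f}")
```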
“…By annotating the text with named entity information before training the model, unique multi-word entries in the dictionary directly relate to known entities. Almasian et al. propose such a model for entity-annotated texts [2]. Other interesting approaches build networks of co-occurring words and entities.…”
Section: Semantic Exploration Using Visualizations
Mentioning confidence: 99%
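
A minimal sketch of the annotation step this excerpt describes: collapsing known multi-word entity mentions into single dictionary entries before embedding training. The mention list, greedy longest-match strategy, and joined token format are assumptions for illustration, not the actual preprocessing of Almasian et al.

```python
# Hypothetical mention dictionary; in practice this would come from
# a named entity recognizer or a knowledge base.
KNOWN_MENTIONS = {
    ("new", "york", "city"): "New_York_City",
    ("barack", "obama"): "Barack_Obama",
}

def collapse_mentions(tokens):
    """Greedily replace longest matching mention spans with one token."""
    out, i = [], 0
    max_len = max(len(m) for m in KNOWN_MENTIONS)
    while i < len(tokens):
        for span in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + span])
            if key in KNOWN_MENTIONS:
                out.append(KNOWN_MENTIONS[key])
                i += span
                break
        else:
            out.append(tokens[i])
            i += 1
    return out

print(collapse_mentions(["barack", "obama", "visited", "new", "york", "city"]))
# -> ['Barack_Obama', 'visited', 'New_York_City']
```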
“…Auditing firms and law enforcement need to sift through massive amounts of data to gather evidence of criminal activity, often involving communication networks and documents [28]. Current computer-aided exploration tools offer a wide range of features, from data ingestion, exploration, and analysis to visualization. This way, users can quickly navigate the underlying data based on extracted attributes, which would otherwise be infeasible due to the often large amount of heterogeneous data.…”
Section: Introduction
Mentioning confidence: 99%
“…, m_N} be the set of N mentions contained in D, and E be the set of entities in the reference KG G. A low-dimensional representation (embedding) can be learned for each entity by applying node representation learning techniques such as DeepWalk (Perozzi et al., 2014) to the graph G. The entity embeddings learned by these techniques are known to be meaningful with respect to the relatedness of the entities they represent (Almasian et al., 2019). The EL task consists in finding, for each mention m ∈ M_D, the entity e ∈ E to which it refers.…”
Section: Problem and Notation
Mentioning confidence: 99%
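
A brief sketch of the DeepWalk-style procedure this excerpt mentions: uniform random walks over the graph are treated as sentences and fed to a skip-gram model. The toy adjacency list, walk length, and hyperparameters are assumptions, not values from the cited papers.

```python
# Hedged sketch of DeepWalk-style node embeddings (Perozzi et al., 2014).
import random
from gensim.models import Word2Vec

graph = {  # tiny hypothetical undirected graph as an adjacency list
    "Berlin": ["Germany", "Angela_Merkel"],
    "Germany": ["Berlin", "Angela_Merkel"],
    "Angela_Merkel": ["Berlin", "Germany"],
}

def random_walk(start, length=10):
    """Sample a uniform random walk of the given length from start."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

# Several walks per node serve as the "sentences" for skip-gram.
walks = [random_walk(node) for node in graph for _ in range(20)]
model = Word2Vec(walks, vector_size=32, window=4, min_count=1, sg=1)
print(model.wv.most_similar("Berlin", topn=2))
```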
“…If the number of eigenthemes is chosen to be small, these components will constitute a good basis to approximate the dense region of the document embedding matrix. Under the fundamental assumption that topical relatedness exists across the gold entities in a document and that such relatedness is captured by their corresponding embeddings (Almasian et al., 2019), the gold entities will form a dense region and, consequently, will define the subspace. However, this is only possible if there is no other subset of candidate entities whose relatedness is larger than that of the set of gold entities.…”
Section: Entity Linking with Eigenthemes
Mentioning confidence: 99%
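
A hedged sketch of the eigentheme construction this excerpt outlines: take the top singular vectors of the candidate-embedding matrix as a subspace approximating the dense region, then rank candidates by their residual to that subspace. The random embeddings and the choice of k are illustrative assumptions, not the cited paper's setup.

```python
# Sketch: rank candidate entities by distance to the subspace spanned
# by the top-k singular vectors of their embedding matrix.
import numpy as np

rng = np.random.default_rng(0)
candidates = rng.normal(size=(40, 64))   # 40 stand-in entity embeddings
candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)

k = 3  # hypothetical number of eigenthemes
_, _, vt = np.linalg.svd(candidates, full_matrices=False)
basis = vt[:k]                           # (k, 64) orthonormal row basis

# Residual after projecting each candidate onto the subspace;
# a smaller residual means the candidate lies in the dense region.
proj = candidates @ basis.T @ basis
residual = np.linalg.norm(candidates - proj, axis=1)
ranking = np.argsort(residual)           # most "on-topic" candidates first
print(ranking[:5])
```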