2017
DOI: 10.1007/978-3-319-69548-8_21

DBpedia Entity Type Detection Using Entity Embeddings and N-Gram Models

Cited by 7 publications
(6 citation statements)
References 18 publications
“…This paradigm requires to provide gold type labels for entities and train binary or multi-class classifiers to classify entities to the right types. Some previous studies [5,36] proposed to use supervised classification for typing error detection, but it is hard to scale as the number of types is large for many KGs (in total 778 types in DBpedia). One work [5] tried to tackle the scalability problem by another entity type dataset of better quality, but this could not fundamentally solve the issue as external datasets are also noisy and may be unavailable.…”
Section: Classification (mentioning)
confidence: 99%
“…Data-driven approaches to deal with typing errors in factual KGs have a very broad spectrum, covering fully unsupervised clustering and outlier detection [1,21], semi-supervised noise models that could leverage noisy labels [10,12,14], and supervised noise detection methods that fully rely on gold labels [5,36]. In this study, we present a taxonomy of the KG typing error detection paradigms and comprehensively evaluate those paradigms on DBpedia.…”
Section: Introduction (mentioning)
confidence: 99%
“…2. This paper introduces novel techniques in clickstream data analytics to unleash key customer journeys through pattern mining using the n-grams and Student T-Test, which distinguishes between regular patterns and special sequences [40,48]. A model is proposed to predict users' transition from one state to another based on the higher-order Markov chains.…”
Section: A Case Study Through Clickstream Data Analysis (mentioning)
confidence: 99%
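“…Early methods generally treat an entity and its type as the head and tail entities of a triple whose predicate is type; the resulting RDF triples can be passed through an embedding learning process to learn vector representations and thereby perform type prediction. However, simply using a single type of an entity as the tail entity loses a great deal of information, such as the context of the text in which the entity appears and the descriptions of such entities in external knowledge bases; moreover, an entity's types are diverse and hierarchical. Embedding learning can therefore also be applied to the text associated with an entity [69,70], for example by converting the entity itself and its context into low-dimensional vector representations and feeding these vectors into a deep learning model [71,72], which reduces type inference to a classification decision made by a neural network. An attention mechanism can also be introduced into the network at this point; it can derive from the conditional constraints commonly used in natural language processing, or be generated from the type hierarchy information already known in the knowledge base. In short, external information [72] is brought in to improve the embedding-based methods. [75] uses backpropagation to optimize the weights, but needs a very large number of iterations to converge; Minkov and Cohen [76] proposed in 2008 a random walk strategy based on a generative learning model to make the entities along a path more relevant; in particular, in 2010 Lao et al. [77] proposed the representative path ranking algorithm (PRA), which optimizes the edge-parameterized random walk model and adds constraints to improve computational efficiency.…”
Section: Type inference mechanisms based on representation learning (unclassified)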