Enhancing Knowledge Graph Embedding with Probabilistic Negative Sampling

Kanojia, Vibhor; Maeda, Hideyuki; Togashi, Riku; Fujita, Sumio

doi:10.1145/3041021.3054238

Cited by 12 publications

(10 citation statements)

References 3 publications

“…if t is a fixed constant. The rate decreases to (N log N) for the choice of t producing (17). It is also implied by (17)…”

Section: Upper Bounds (mentioning)

confidence: 87%

“…Let the number of entities N → ∞ and C, K, U, d_E, d_R, α, γ be absolute constants. If Assumptions 1 and 2 hold and ρ_1 + ρ_2 = o(log N), then asymptotic inequalities (16), (17), and (18) in Theorem 1 hold.…”

Section: Theorem (mentioning)

confidence: 99%

“…Various forms of f are proposed, such as distance models [7], bilinear models [12][13][14], and neural networks [15]. Computational algorithms are proposed to improve link prediction for knowledge bases [16,17]. The statistical properties of the embedding-based MRN models have not been rigorously studied.…”

Section: Introduction (mentioning)

confidence: 99%

Statistical Analysis of Multi-Relational Network Recovery

Wang, Tang, Liu. 2020. Front. Appl. Math. Stat.

In this paper, we develop asymptotic theories for a class of latent variable models for large-scale multi-relational networks. In particular, we establish consistency results and asymptotic error bounds for the (penalized) maximum likelihood estimators when the size of the network tends to infinity. The basic technique is to develop a non-asymptotic error bound for the maximum likelihood estimators through large deviations analysis of random fields. We also show that these estimators are nearly optimal in terms of minimax risk.

“…Kanojia et al [46] proposes probabilistic negative sampling to address the issue of skewed data that commonly exists in knowledge bases. For relations with less data, Uniform or Bernoulli random sampling fails to predict the missing part of golden triples among semantically possible options even after hundreds of epochs of training.…”

Section: Probabilistic Sampling (mentioning)

confidence: 99%

Understanding Negative Sampling in Knowledge Graph Embedding

Qian, Li, Atkinson et al. 2021. IJAIA

Knowledge graph embedding (KGE) is to project entities and relations of a knowledge graph (KG) into a low-dimensional vector space, which has made steady progress in recent years. Conventional KGE methods, especially translational distance-based models, are trained through discriminating positive samples from negative ones. Most KGs store only positive samples for space efficiency. Negative sampling thus plays a crucial role in encoding triples of a KG. The quality of generated negative samples has a direct impact on the performance of learnt knowledge representation in a myriad of downstream tasks, such as recommendation, link prediction and node classification. We summarize current negative sampling approaches in KGE into three categories, static distribution-based, dynamic distribution-based and custom cluster-based respectively. Based on this categorization we discuss the most prevalent existing approaches and their characteristics. It is a hope that this review can provide some guidelines for new thoughts about negative sampling in KGE.
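
The abstract above notes that most KGs store only positive triples, so negatives must be generated. As a rough illustration only, the following sketch shows the plain uniform baseline (corrupt the head or tail with a uniformly drawn entity, rejecting stored positives). It is not the probabilistic method of Kanojia et al.; the function name `uniform_negative`, the toy triples, and the `max_tries` parameter are all hypothetical.

```python
import random

def uniform_negative(triple, entities, known, rng, max_tries=100):
    """Corrupt the head or tail of a positive triple with a uniformly drawn entity.

    Hypothetical helper for illustration; not taken from any cited paper.
    """
    head, rel, tail = triple
    for _ in range(max_tries):
        candidate = rng.choice(entities)
        # Flip a fair coin to decide which side of the triple to corrupt.
        if rng.random() < 0.5:
            negative = (candidate, rel, tail)
        else:
            negative = (head, rel, candidate)
        # Reject corruptions that are themselves stored positives.
        if negative not in known:
            return negative
    raise RuntimeError("no negative found; the graph may be too dense")

# Toy data, invented for this sketch.
positives = [
    ("tokyo", "capital_of", "japan"),
    ("paris", "capital_of", "france"),
    ("kyoto", "located_in", "japan"),
]
entities = sorted({e for h, _, t in positives for e in (h, t)})
known = set(positives)
rng = random.Random(0)  # fixed seed so runs are repeatable

negatives = [uniform_negative(p, entities, known, rng) for p in positives]
```

Each generated negative keeps the relation and exactly one of the original entities. Because the replacement is drawn uniformly, many negatives are semantically implausible, which is the weakness the citing reviews attribute to uniform and Bernoulli sampling on relations with little data.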

“…Kanojia et al [44] proposes probabilistic negative sampling to address the issue of skewed data that commonly exists in knowledge bases. For relations with less data, Uniform or Bernoulli random sampling fails to predict the missing part of golden triplets among semantically possible options even after hundreds of epochs of training.…”

Section: Probabilistic Sampling (mentioning)

confidence: 99%

Negative Sampling in Knowledge Representation Learning: A Mini-Review

Qian, Li, Atkinson et al. 2020. Computer Science & Information Technology (CS & IT)

Knowledge representation learning (KRL) aims at encoding components of a knowledge graph (KG) into a low-dimensional continuous space, which has brought considerable successes in applying deep learning to graph embedding. Most famous KGs contain only positive instances for space efficiency. Typical KRL techniques, especially translational distance-based models, are trained through discriminating positive and negative samples. Thus, negative sampling is unquestionably a non-trivial step in KG embedding. The quality of generated negative samples can directly influence the performance of final knowledge representations in downstream tasks, such as link prediction and triple classification. This review summarizes current negative sampling methods in KRL and we categorize them into three sorts, fixed distribution-based, generative adversarial net (GAN)-based and cluster sampling. Based on this categorization we discuss the most prevalent existing approaches and their characteristics.
