This study addresses the missing link prediction problem: predicting the existence of missing connections between entities of interest. We approach link prediction through coupled analysis of relational datasets represented as heterogeneous data, i.e., datasets in the form of matrices and higher-order tensors. We propose an approach based on a probabilistic interpretation of tensor factorisation models, namely Generalised Coupled Tensor Factorisation, which can simultaneously fit a large class of tensor models to higher-order tensors and matrices with common latent factors using different loss functions. Numerical experiments demonstrate that joint analysis of data from multiple sources via coupled factorisation improves link prediction performance, and that selecting the right loss function and tensor model is crucial for accurately predicting missing links.
Summary - This study presents models defined within the Generalised Coupled Tensor Factorisation (GCTF) framework for the missing link prediction problem. GCTF is an algorithmic framework that can extract common latent factors by simultaneously factorising models that share tensors. In addition to the GCTF framework, this paper provides the algorithm used to factorise symmetric matrices. Previous work on factorising heterogeneous data focuses either on a single divergence or on a specific tensor factorisation model; however, one of the main challenges in heterogeneous data analysis is finding the right tensor model and divergence. This study therefore considers different tensor models and divergences. Experiments on real datasets show that jointly analysing data from multiple sources via coupled tensor factorisation, and including symmetric data in the coupled models, improves link prediction performance; they also show the importance of choosing the right divergence and tensor model.

Keywords - Coupled tensor factorisation; Link prediction; Missing data; Data fusion; Symmetric matrix.

Abstract - This study addresses missing link prediction, the problem of predicting the existence of missing connections between entities of interest. Link prediction is addressed through coupled analysis of relational datasets represented by several matrices, including symmetric ones, and multiway arrays, referred to here simply as tensors. We propose an approach based on a probabilistic interpretation of tensor factorisation models, namely Generalised Coupled Tensor Factorisation (GCTF), which can simultaneously fit a large class of tensor models to higher-order tensors and matrices with common latent factors using different loss functions. In addition, we propose an algorithm for the factorisation of symmetric matrices. Numerical experiments demonstrate that joint analysis of data from multiple sources via coupled factorisation, together with the integration of symmetric matrices into the models, improves link prediction performance, and that selecting the right loss function and tensor model is crucial for accurately predicting missing links.
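
To make the idea of coupling through common latent factors concrete, the following is a minimal sketch, not the authors' implementation. It assumes the simplest coupled model: two nonnegative matrices `X1` (I×J) and `X2` (I×K) that share their row entities, factorised as `X1 ≈ A Bᵀ` and `X2 ≈ A Cᵀ` with a common factor `A`. The function name `coupled_factorise`, the binary masks `M1` and `M2` (1 for observed entries, 0 for missing ones), and the loss parameter `p` are illustrative assumptions. The updates are standard masked multiplicative updates for the beta-divergence family, where `p = 0` is taken to mean Euclidean distance, `p = 1` Kullback-Leibler divergence, and `p = 2` Itakura-Saito divergence.

```python
import numpy as np

def coupled_factorise(X1, M1, X2, M2, rank=2, p=1.0, n_iter=200, seed=0, eps=1e-9):
    """Hypothetical sketch: fit X1 ~ A @ B.T and X2 ~ A @ C.T with shared A.

    M1, M2 are 0/1 masks; masked (0) entries do not influence the fit.
    p selects the loss: 0 Euclidean, 1 KL, 2 Itakura-Saito (assumed convention).
    """
    rng = np.random.default_rng(seed)
    I, J = X1.shape
    K = X2.shape[1]
    # Strictly positive random initialisation keeps the updates well defined.
    A = rng.random((I, rank)) + 0.1
    B = rng.random((J, rank)) + 0.1
    C = rng.random((K, rank)) + 0.1

    for _ in range(n_iter):
        # Shared factor A: contributions from both datasets are summed.
        X1h = A @ B.T + eps
        X2h = A @ C.T + eps
        num = (M1 * X1 * X1h ** (-p)) @ B + (M2 * X2 * X2h ** (-p)) @ C
        den = (M1 * X1h ** (1 - p)) @ B + (M2 * X2h ** (1 - p)) @ C
        A *= num / (den + eps)

        # Factor B only sees the first dataset.
        X1h = A @ B.T + eps
        B *= ((M1 * X1 * X1h ** (-p)).T @ A) / ((M1 * X1h ** (1 - p)).T @ A + eps)

        # Factor C only sees the second dataset.
        X2h = A @ C.T + eps
        C *= ((M2 * X2 * X2h ** (-p)).T @ A) / ((M2 * X2h ** (1 - p)).T @ A + eps)

    return A, B, C
```

Under these assumptions, scores for the missing links of the first dataset can be read from the reconstruction at the masked positions, for example `(A @ B.T)[M1 == 0]`. Comparing such scores across different values of `p` is one simple way to see why the choice of loss function matters.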