Proceedings of the 13th ACM Conference on Recommender Systems 2019
DOI: 10.1145/3298689.3347034

Relaxed softmax for PU learning

Abstract: In recent years, the softmax model and its fast approximations have become the de facto loss functions for deep neural networks when dealing with multi-class prediction. This loss has been extended to language modeling and recommendation, two fields that fall into the framework of learning from Positive and Unlabeled data. In this paper, we stress the different drawbacks of the current family of softmax losses and sampling schemes when applied in a Positive and Unlabeled learning setup. We propose both a Relaxe…
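To make the baseline family the abstract refers to concrete, below is a minimal NumPy sketch contrasting the exact softmax negative log-likelihood (which scores every item in the catalogue) with a sampled-softmax approximation (one common sampling scheme). This is a generic illustration, not the paper's Relaxed Softmax; all names (full_softmax_nll, sampled_softmax_nll, item_emb, etc.) are hypothetical.

```python
# Generic full vs. sampled softmax NLL for one (context, positive item) pair.
# Not the paper's method; a sketch of the standard baselines it critiques.
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 10_000, 32
item_emb = rng.normal(size=(n_items, dim))   # item embeddings
user_emb = rng.normal(size=dim)              # context / user embedding
pos_item = 42                                # the single observed positive

def full_softmax_nll(user, items, pos):
    """Exact NLL: scores every item, so the cost scales with the catalogue size."""
    logits = items @ user
    log_z = np.logaddexp.reduce(logits)      # log-partition over the whole catalogue
    return log_z - logits[pos]

def sampled_softmax_nll(user, items, pos, n_neg=50):
    """Approximate NLL using uniformly sampled negatives instead of all items."""
    neg = rng.integers(0, items.shape[0], size=n_neg)
    logits = items[np.concatenate(([pos], neg))] @ user
    return np.logaddexp.reduce(logits) - logits[0]

print(full_softmax_nll(user_emb, item_emb, pos_item))
print(sampled_softmax_nll(user_emb, item_emb, pos_item))
```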


Cited by 9 publications (5 citation statements). References 26 publications (28 reference statements).

“…predicting missing elements of the MovieLens dataset (Harper and Konstan 2015)). Maximum likelihood estimation also suffers from a computational cost that scales in O(P) but the problem has a different mathematical form to policy learning and methods developed in the maximum likelihood context (Tanielian and Vasile 2019; Gutmann and Hyvärinen 2010; Rendle et al. 2009) cannot be applied to policy learning.…”
Section: Related Work
confidence: 99%
“…Naturally, there is no necessity for OVR classification for DNNs, since they already support multi-class classification by design. However, there are common situations where OVR becomes relevant, e.g., if faced with only positively labeled samples and all remaining samples with potentially unknown sources are assigned to a single negative class [16,31,66] or if the goal, as in this paper, is to filter normal samples from abnormal ones [6]. As motivated previously, vanilla MLPs are unsuitable in this case due to their infinite open space risk and consequential robustness deficiencies toward outliers [6].…”
Section: Related Work
confidence: 99%
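The PU setup this excerpt describes can be made concrete with a small sketch: only a subset of samples carries a positive label, and every remaining, unlabeled sample is assigned to a single negative class before fitting a binary (one-vs-rest style) classifier. This is an illustration under those assumptions, not code from the cited paper; all variable names are hypothetical.

```python
# PU-style labeling: known positives -> 1, all unlabeled samples -> treated as 0.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(1_000, 8))                          # feature matrix
positive_idx = rng.choice(1_000, size=100, replace=False)  # the labeled positives

y = np.zeros(1_000, dtype=int)
y[positive_idx] = 1                                      # everything else kept as "negative"

clf = LogisticRegression(max_iter=1_000).fit(X, y)
scores = clf.predict_proba(X)[:, 1]                      # ranking score over all samples
```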
“…For quantifying the distance between the true and estimated distributions, Kullback-Leibler divergence (KLD) [35] is often utilised. Tanielian et al. [56] recently proposed a DE approach based on the maximum likelihood estimation (MLE) of softmax density functions. However, their MLE formulation leads to an intractable log-partition $\log \sum_{i \in \mathcal{I}} \exp(f_u(i))$ in the log-likelihood function, which makes SGD-based optimisation difficult for large-scale settings.…”
Section: Paradigms of Personalised Ranking
confidence: 99%
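For orientation, the objective being discussed has the standard per-observation softmax log-likelihood form below. This is a generic rendering based on the notation in the excerpt, not a quotation from either paper: $f_u(i)$ is the model score of item $i$ in context $u$, $i^+$ the observed positive, and $\mathcal{I}$ the full item catalogue.

```latex
% Per-observation softmax log-likelihood; the second term is the log-partition,
% a sum over the entire catalogue \mathcal{I}, which is what makes exact
% gradient updates expensive when \mathcal{I} is large.
\log p_u(i^+) = f_u(i^+) - \log \sum_{i \in \mathcal{I}} \exp\bigl(f_u(i)\bigr)
```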