“…[SACRA] Li et al. [68] propose a novel recommender model, named Click Feedback-Aware Network (CFAN), to provide query suggestions considering the sequential search queries issued by the user and her history of clicks. The authors employ additional adversarial (re)training epochs (i.e., adding adversarial perturbations on item embeddings) to improve the robustness of the model.…”
Section: Adversarial Machine Learning for Attack and Defense on RS
Latent-factor models (LFM) based on collaborative filtering (CF), such as matrix factorization (MF) and deep CF methods, are widely used in modern recommender systems (RS) due to their excellent performance and recommendation accuracy. Notwithstanding their great success, in recent years it has been shown that these methods are vulnerable to adversarial examples, i.e., subtle but non-random perturbations designed to force recommendation models to produce erroneous outputs. The main reason for this behavior is that the user interaction data used to train an LFM can be contaminated by malicious activities or by users' misoperations, which can introduce an unpredictable amount of natural noise and harm recommendation outcomes. On the other hand, it has been shown that adversarial techniques, conceived originally to attack machine learning applications, can also be successfully adopted to strengthen models against attacks and to train more precise recommendation engines.

In this respect, the goal of this survey is two-fold: (i) to present recent advances in adversarial machine learning (AML) for the security of RS (i.e., attacking and defending recommendation models), and (ii) to show another successful application of AML in generative adversarial networks (GANs), which use the core learning concept of AML (i.e., the min-max game) for generative applications. We provide an exhaustive literature review of 60 articles published in major RS and ML journals and conferences. This review serves as a reference for the RS community working on the security of RS and on recommendation models that leverage generative models to improve their quality.
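The min-max game mentioned above can be written compactly. A generic form, following common adversarial personalized ranking formulations (notation illustrative, not necessarily the survey's own), is:

```latex
\min_{\Theta} \;\; \max_{\Delta,\, \|\Delta\| \le \epsilon} \;\; \mathcal{L}\big(\mathcal{D} \mid \Theta + \Delta\big)
```

where \(\Theta\) denotes the model parameters, \(\Delta\) an adversarial perturbation bounded by a budget \(\epsilon\), and \(\mathcal{L}\) the training loss over data \(\mathcal{D}\): the inner player crafts the worst-case perturbation, while the outer player updates the model to resist it.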
“…where Θ denotes a set of current model parameters. As it is difficult to get the exact optimal solutions of Δ_adv, we employ the fast gradient method proposed in [8], a common choice in adversarial training [12,18,19,22], to estimate Δ_adv. The idea is to approximate the objective function around Δ as a linear function.…”
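The fast gradient approximation described in this quote can be sketched in a few lines. This is a generic illustration, not the implementation from [8]: linearizing the loss around Δ = 0 makes the worst-case perturbation simply the (scaled) normalized gradient. The names `fast_gradient_perturbation` and `epsilon` are illustrative.

```python
import numpy as np

def fast_gradient_perturbation(grad, epsilon=0.5):
    """Approximate Delta_adv by linearizing the loss around Delta = 0:
    Delta_adv = epsilon * grad / ||grad||_2 (zero if the gradient vanishes)."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    return epsilon * grad / norm

# Toy usage: gradient of the loss w.r.t. an item embedding
g = np.array([3.0, 4.0])
delta = fast_gradient_perturbation(g, epsilon=1.0)
# delta has L2 norm equal to epsilon and points along the gradient
```

During adversarial (re)training, such a `delta` would be added to the item embeddings before recomputing the loss, so the model learns to be robust to exactly this worst-case linearized perturbation.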
Automatic speaker recognition (ASR) is a stepping-stone technology towards semantic multimedia understanding and benefits versatile downstream applications. In recent years, neural network-based ASR methods have demonstrated remarkable power, achieving excellent recognition performance given sufficient training data. However, it is impractical to collect sufficient training data for every user, especially for new users, so a large portion of users have only a very limited number of training instances. The lack of training data prevents ASR systems from accurately learning users' acoustic biometrics, jeopardizes downstream applications, and eventually impairs the user experience.

In this work, we propose an adversarial few-shot learning-based speaker identification framework (AFEASI) to develop robust speaker identification models with only a limited number of training instances. We first employ metric learning-based few-shot learning to learn speaker acoustic representations, where the limited instances are comprehensively utilized to improve identification performance. In addition, adversarial learning with adversarial examples is applied to further enhance the generalization and robustness of speaker identification. Experiments conducted on a publicly available large-scale dataset demonstrate that AFEASI significantly outperforms eleven baseline methods. An in-depth analysis further indicates both the effectiveness and the robustness of the proposed method.
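The abstract does not spell out AFEASI's architecture, but the metric learning-based few-shot idea it builds on can be sketched generically: average the few support embeddings per speaker into a prototype, then identify a query utterance by its nearest prototype. All names and the 2-D "acoustic embeddings" below are illustrative, not from the paper.

```python
import numpy as np

def prototypes(support, labels):
    """Mean embedding per speaker, computed from the few support instances."""
    classes = sorted(set(labels))
    return classes, np.stack([
        np.mean([e for e, l in zip(support, labels) if l == c], axis=0)
        for c in classes
    ])

def identify(query, support, labels):
    """Return the speaker whose prototype is closest (L2) to the query embedding."""
    classes, protos = prototypes(support, labels)
    dists = np.linalg.norm(protos - query, axis=1)
    return classes[int(np.argmin(dists))]

# Toy 2-speaker, 2-shot example with 2-D embeddings
support = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.2, 1.0]])
labels = ["alice", "alice", "bob", "bob"]
speaker = identify(np.array([0.1, 0.1]), support, labels)
```

In a full pipeline, the embeddings would come from a trained acoustic encoder, and the adversarial-learning component would perturb them (as in the fast gradient method above) during training to harden the encoder.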
“…

| Dataset Name | Size | Publicly Available | Citations |
| --- | --- | --- | --- |
| AOL | 16M queries, 3M sessions | Yes | [38], [21], [11], [1], [12], [37], [40], [30], [31], [8], [35], [7], [6], [5], [19], [34], [10], [15], [20], [11], [13] |
| MS MARCO | 1M queries | Yes | [1], [40], [6], [5] |
| Yahoo Search Engine | 4M queries, 549K sessions | No | [25] |
| Tencent website | 160M queries | No | [17], [16] |
| "Baidu Knows" website | 85K pairs of (question, best answer) | No | [27] |

…later switches to searching about dogs, which shows gradual topic drift revolving around the abstract concept of animals. For this reason, we believe that a gold standard dataset of queries is required that would not rely on the weak assumption of gradual query improvement within the same session.…”
Section: Dataset Name
“…The objective of query refinement is to deduce the intent of the user's query and then formulate an alternative set of queries in order to fill the semantic gap between the input query and that of the documents. More recently, neural-based models have received increasing attention for performing the supervised query refinement task [16,25,38]. Such approaches require high-quality training data to learn translations from the user's query to an improved, revised query.…”
In this paper, we implement and publicly share a configurable software workflow and a collection of gold standard datasets for training and evaluating supervised query refinement methods. Existing datasets such as AOL and MS MARCO, which have been extensively used in the literature, are based on the weak assumption that users' input queries improve gradually within a search session, i.e., that the last query, where the user ends her information-seeking session, is the best reconstructed version of her initial query. In practice, such an assumption is not necessarily accurate for a variety of reasons, e.g., topic drift. The objective of our work is to enable researchers to build gold standard query refinement datasets without having to rely on such weak assumptions.

Our software workflow, which generates such gold standard query datasets, takes three inputs: (1) a dataset of queries along with their associated relevance judgements (e.g., TREC topics), (2) an information retrieval method (e.g., BM25), and (3) an evaluation metric (e.g., MAP), and outputs a gold standard dataset. The produced gold standard dataset includes a list of revised queries for each query in the input dataset, each of which effectively improves the performance of the specified retrieval method in terms of the chosen evaluation metric. Since our workflow can be used to generate gold standard datasets for any input query set, we have generated and publicly shared gold standard datasets for TREC queries associated with Robust04, Gov2, ClueWeb09, and ClueWeb12. The source code of our software workflow, the generated gold datasets, and benchmark results for three state-of-the-art supervised query refinement methods over these datasets are made publicly available for reproducibility purposes.
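The three-input workflow described above (queries + judgements, a retrieval method, an evaluation metric) can be sketched in a few lines. The stand-ins below are deliberately trivial, term-overlap "retrieval" and "metric" stubs rather than BM25 and MAP, since the paper's components are pluggable; all names (`build_gold_dataset`, `candidates_fn`, etc.) are illustrative.

```python
def build_gold_dataset(queries, candidates_fn, retrieve, metric):
    """For each query, keep only the candidate revisions that strictly
    improve the evaluation metric of the given retrieval method."""
    gold = {}
    for qid, query in queries.items():
        base = metric(retrieve(query), qid)
        gold[qid] = [c for c in candidates_fn(query)
                     if metric(retrieve(c), qid) > base]
    return gold

# Toy stand-ins (illustrative only, not BM25/MAP)
relevant_terms = {"q1": {"neural", "query", "refinement"}}
def retrieve(q):            # stub "retrieval": the set of query terms
    return set(q.split())
def metric(results, qid):   # stub "metric": overlap with relevance judgements
    return len(results & relevant_terms[qid])
def candidates_fn(q):       # stub revision generator
    return [q + " refinement", q + " history"]

gold = build_gold_dataset({"q1": "neural query"}, candidates_fn, retrieve, metric)
```

With real components, `retrieve` would run e.g. BM25 over an index, `metric` would score the ranked list against TREC qrels with e.g. MAP, and `candidates_fn` would enumerate candidate reformulations, yielding exactly the per-query lists of improving revisions that form the gold standard.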