Session-based recommendation has received growing attention recently due to the increasing privacy concern. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and requires "continual learning" in real-life applications. In this case, the recommender is updated continually and periodically with new data that arrives in each update cycle, and the updated model needs to provide recommendations for user activities before the next model update. A major challenge for continual learning with neural models is catastrophic forgetting, in which a continually trained model forgets user preference patterns it has learned before. To deal with this challenge, we propose a method called Adaptively Distilled Exemplar Replay (ADER) by periodically replaying previous training samples (i.e., exemplars) to the current model with an adaptive distillation loss. Experiments are conducted based on the state-of-the-art SASRec model using two widely used datasets to benchmark ADER with several well-known continual learning techniques. We empirically demonstrate that ADER consistently outperforms other baselines, and it even outperforms the method using all historical data at every update cycle. This result reveals that ADER is a promising solution to mitigate the catastrophic forgetting issue towards building more realistic and scalable session-based recommenders.
CCS CONCEPTS• Computing methodologies → Neural networks; Online learning settings.
Although bird swarm optimization (BSA) algorithm shows excellent performance in solving continuous optimization problems, it is not an easy task to apply it solving the combination optimization problem such as traveling salesman problem (TSP). Therefore, this paper proposed a novel discrete BSA based on information entropy matrix (DBSA) for TSP. Firstly, in the DBSA algorithm, the information entropy matrix is constructed as a guide for generating new solutions. Each element of the information entropy matrix denotes the information entropy from city i to city j. The higher the information entropy, the larger the probability that a city will be visited. Secondly, each TSP path is represented as an array, and each element of the array represents a city index. Then according to the needs of the minus function proposed in this paper, each TSP path is transformed into a Boolean matrix which represents the relationship of edges. Third, the minus function is designed to evaluate the difference between two Boolean matrices. Based on the minus function and information entropy matrix, the birds’ position updating equations are redesigned to update the information entropy matrix without changing the original features. Then three TSP operators are proposed to generate new solutions according to the updated information entropy matrix. Finally, the performance of DBSA algorithm was tested on a large number of benchmark TSP instances. Experimental results show that DBSA algorithm is better or competitively outperforms many state-of-the-art metaheuristic algorithms.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.