Supervised learning methods need a large amount of labeled data to find reliable classification boundaries. In practice, however, labeled data is hard to obtain, and labeling is time-consuming and costly. Although unlabeled data is far more plentiful than labeled data, most supervised learning methods are not designed to exploit it. Self-training is a semi-supervised learning method that alternates between training a base classifier and labeling the unlabeled data in the training set. Most self-training methods use a confidence measure to select confidently labeled examples, because high confidence usually implies low error. A major difficulty of self-training is error amplification: if a classifier misclassifies some examples and those examples are added to the labeled training set, the next classifier may learn improper classification boundaries and produce even more misclassified examples. Because base classifiers are built from a small labeled dataset, they struggle to generalize well. Even when the training procedure and classifier performance are improved, errors are inevitable, so the self-labeled data must be corrected to avoid error amplification in subsequent classifiers. In this paper, we propose a deep neural network based approach that alleviates these problems of self-training by combining three schemes: pre-training, dropout, and error forgetting. Applied in combination to various datasets, our approach yields classifiers that outperform those trained with common self-training.
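The basic self-training loop described above (train, label the unlabeled pool, keep only confident labels, repeat) can be sketched as follows. This is a minimal illustration of common self-training, not the proposed deep network approach: the nearest-centroid base classifier, the softmax-over-distances confidence measure, the `threshold` value, and all function names are assumptions made for the example. It includes no pre-training, dropout, or error forgetting.

```python
import math

def centroids(points, labels):
    """Fit the base classifier: the mean point of each class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        s = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict_with_confidence(model, p):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    weights = {y: math.exp(-math.dist(p, c)) for y, c in model.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())

def self_train(labeled, labels, unlabeled, threshold=0.8, max_rounds=10):
    """Alternate between training and self-labeling (hypothetical helper)."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        model = centroids(labeled, labels)
        remaining, added = [], 0
        for p in pool:
            y, conf = predict_with_confidence(model, p)
            if conf >= threshold:
                # Self-labeled example joins the training set; if y is wrong,
                # later rounds inherit the error (error amplification).
                labeled.append(p)
                labels.append(y)
                added += 1
            else:
                remaining.append(p)
        pool = remaining
        if added == 0:  # nothing confident enough is left to label
            break
    return centroids(labeled, labels), labeled, labels, pool

model, pts, ys, leftover = self_train(
    [(0.0, 0.0), (4.0, 4.0)], ["a", "b"],
    [(0.5, 0.2), (3.8, 4.1), (2.0, 2.0)],
)
```

Note that once an example is self-labeled in this sketch, its label is never revisited, which is exactly the weakness that motivates correcting self-labeled data.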
An approach for game bot detection in massively multiplayer online role-playing games (MMORPGs) based on the analysis of game playing behavior is proposed. Since MMORPGs are large-scale games, users can play in various ways. This variety in playing behavior makes it hard to detect game bots based on play behaviors. To cope with this problem, the proposed approach observes the game playing behaviors of users and groups them by their behavioral similarities. Then, it develops a local bot detection model for each player group. Since the locally optimized models can detect game bots more accurately within each player group, the combination of those models brings about an overall improvement. Behavioral features are selected and developed to accurately detect game bots with low-resolution data, considering common aspects of MMORPG playing. Experiments with real data from a game currently in service show that the proposed local model approach yields more accurate results.
Imbalanced data refers to the situation in which data samples are unequally distributed across classes. It poses a challenge to most classification methods, because minority class samples are hard to learn and predict when there are too few minority instances compared to majority instances. One approach to the class imbalance problem is oversampling: synthetic samples are generated around the given minority instances based on their nearest neighbors, so that the numbers of majority and minority instances are balanced. However, if the nearest neighbors are wrongly chosen, overfitting or underfitting may result. We propose a novel oversampling method for efficiently handling imbalanced data problems. The proposed method generates synthetic samples and decides whether to accept or reject each one by considering its location. With the proposed method, we observed superior results on real-world imbalanced datasets. In addition, the proposed method is less sensitive to the choice of nearest neighbors used for generating synthetic samples than existing approaches to the imbalance problem.
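As a rough illustration of nearest-neighbor oversampling with an accept/reject step, the sketch below interpolates between a minority instance and one of its minority neighbors, then keeps the candidate only if it passes a location check. The abstract does not state the actual acceptance criterion, so the rule used here (accept when the candidate is at least as close to a minority instance as to any majority instance) is purely an assumption, as are the function names and parameters.

```python
import math
import random

def nearest_neighbors(point, candidates, k):
    """The k candidates closest to point, excluding point itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]

def oversample(minority, majority, n_new, k=3, seed=0, max_tries=1000):
    """Generate up to n_new accepted synthetic minority samples (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority instances to interpolate")
    rng = random.Random(seed)
    synthetic, tries = [], 0
    while len(synthetic) < n_new and tries < max_tries:
        tries += 1
        base = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(base, minority, k))
        t = rng.random()
        # Candidate lies on the segment between base and its neighbor.
        candidate = tuple(b + t * (n - b) for b, n in zip(base, neighbor))
        # Assumed acceptance rule based on the candidate's location.
        d_min = min(math.dist(candidate, m) for m in minority)
        d_maj = min(math.dist(candidate, m) for m in majority)
        if d_min <= d_maj:
            synthetic.append(candidate)
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
majority = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0), (5.5, 5.5), (7.0, 7.0)]
new_samples = oversample(minority, majority, n_new=len(majority) - len(minority))
```

With the rejection step removed, this reduces to plain interpolation between neighbors, where a poorly chosen neighbor can place synthetic samples inside majority regions.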
Group recommendation is an important research area, because there are many situations in which a group of users consumes an item together, such as watching a movie, listening to music, or watching TV content with family or friends. To make group recommendations, an understanding of the group users and the domain is necessary. However, there are few analyses of group recommendation systems on real-world datasets, because most studies use synthetic datasets. In this paper, we apply various methods to a real-world dataset and provide some guidelines for group recommendation systems.
In this paper, a novel television (TV) program recommendation method is proposed that merges multiple preferences. We use the channels and genres of programs, information that is available in standalone TVs, as features for the recommendation. The proposed method performs multi-time contextual profiling and constructs multiple-time contextual preference matrices of channels and genres. Since multiple preference models are constructed with different time contexts, there can be conflicts among them. In order to effectively merge the preferences with the minimum number of conflicts, we develop a quadratic programming model. The optimization problem is formulated with a minimum number of constraints so that the optimization process is scalable and fast even in a standalone TV with low computational power. Experiments with a real-world dataset show that the proposed method is more efficient and accurate than other TV recommendation methods. Our method improves recommendation performance by 5-50% compared to the baselines.
Although researchers have proposed various recommendation systems, most recommendation approaches are for single users, and only a small number are for groups. However, TV programs and movies are most often viewed by groups rather than by single users. Most recommendation approaches for groups assume that single users' profiles are known and that group profiles consist of the single users' profiles. However, because it is difficult to obtain group profiles, researchers have only used synthetic or limited datasets. In this paper, we apply various group recommendation approaches to a real large-scale dataset in the TV domain and evaluate them. In addition, we provide some guidelines for group recommendation systems, focusing on home group users in the TV domain.
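To make concrete the idea of a group profile built from single users' profiles, the sketch below aggregates individual ratings with two widely used strategies, averaging and least misery. The abstract does not say which approaches were evaluated, so these two strategies, the function names, and the toy ratings are assumptions for illustration only.

```python
def aggregate(profiles, strategy="average"):
    """Combine per-user ratings into one group score per commonly rated item."""
    if not profiles:
        raise ValueError("at least one user profile is required")
    # Only items rated by every member can be scored for the whole group.
    items = set.intersection(*(set(p) for p in profiles.values()))
    scores = {}
    for item in items:
        ratings = [p[item] for p in profiles.values()]
        if strategy == "average":
            scores[item] = sum(ratings) / len(ratings)
        elif strategy == "least_misery":
            # The group is only as happy as its least satisfied member.
            scores[item] = min(ratings)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return scores

def recommend(profiles, strategy="average", top_n=1):
    """Return the top_n items by group score, ties broken alphabetically."""
    scores = aggregate(profiles, strategy)
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

# Hypothetical household with ratings on a 1-5 scale.
family = {
    "parent": {"news": 5, "drama": 3, "cartoon": 1},
    "child": {"news": 1, "drama": 3, "cartoon": 5},
    "teen": {"news": 5, "drama": 3, "cartoon": 2},
}
by_average = recommend(family, "average")
by_least_misery = recommend(family, "least_misery")
```

In this toy household the two strategies disagree: averaging favors the item two members rate highly, while least misery favors the item nobody dislikes.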