Abstract. In this paper, we apply a support vector machine (SVM) to knowledge discovery (KD) and confirm its effectiveness on a benchmark data set. SVMs have been successfully applied to problems in various domains. However, their effectiveness as a KD method is unknown. We propose SVM for KD, which handles binary classification problems by rescaling each attribute based on z-scores. SVM for KD can sort attributes with respect to their effectiveness in discriminating classes. Moreover, SVM for KD can discover crucial examples for discrimination. We set up six discovery tasks with the meningoencephalitis data set, which is a benchmark data set in KD. A domain expert rated the discovery outcomes of SVM for KD on a scale of one to five with respect to several criteria. The selected attributes in all six tasks are valid and useful: their average scores are 3.8-4.0. Ordering attributes by usefulness is a challenging problem. Nevertheless, our method achieved a score of 4.0 or higher on this problem in three tasks. In addition, the crucial examples for discrimination and the typical examples for each class agree with medical knowledge. These promising results demonstrate the effectiveness of our approach.
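The attribute-ranking idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the SVM is trained or how the ranking is derived, so everything below is an assumption: the sketch z-scores each attribute, fits a linear SVM by plain subgradient descent on the regularized hinge loss, and ranks attributes by the absolute value of their weights. The function names, the toy data, and the attribute names are hypothetical.

```python
import statistics

def zscore_columns(rows):
    """Rescale each attribute (column) to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularized hinge loss (assumed trainer)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the subgradient
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def rank_attributes(w, names):
    """Order attribute names by decreasing absolute weight (assumed ranking rule)."""
    pairs = sorted(zip(names, w), key=lambda p: abs(p[1]), reverse=True)
    return [name for name, _ in pairs]

# Hypothetical toy data: the first attribute separates the classes, the second does not.
rows = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.1], [3.2, 5.0], [2.8, 3.0]]
labels = [-1, -1, -1, 1, 1, 1]

X = zscore_columns(rows)
w, b = train_linear_svm(X, labels)
ranking = rank_attributes(w, ["informative", "noisy"])
print(ranking)
```

Because the attributes are on a common scale after z-scoring, the weight magnitudes are comparable across attributes, which is what makes a weight-based ordering meaningful in this sketch.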
Abstract. In this paper, we argue that a home-use autonomous mobile robot is a platform for a new kind of Intelligent Data Analysis (IDA). Recent advances in hardware and software for robotics have enabled us to construct a small yet powerful autonomous mobile robot from low-cost components. Such a robot is able to perform machine learning and data mining in the real world for a long period, which opens a new avenue for IDA. This paper improves and studies one of our monitoring robots in detail to reveal promising directions and challenges inherent in this new kind of IDA.
Estimating the duration of user behavior is a central concern for most internet companies. Survival analysis is a promising method for analyzing the expected duration of events; it usually assumes the same survival function for all subjects and that the event will eventually occur. However, such assumptions are inappropriate when users behave differently or when some events never occur for some users, for example, the conversion period on web services for light users who have no intention of being active on the service. In particular, if the proportion of inactive users is high, these assumptions can lead to undesirable results. To address these challenges, this paper proposes a mixture model that treats active and inactive individuals separately by means of a latent variable. First, we define this specific problem setting and show the limitations of conventional survival analysis in addressing it. We demonstrate how naturally our Bernoulli-Weibull model can accommodate the challenge. The proposed model is further extended to a Bayesian hierarchical model to incorporate each subject's parameters, offering substantial improvements over conventional, non-hierarchical models in terms of WAIC and WBIC. Second, an experiment and extensive analysis were conducted using real-world data from the Japanese job search website CareerTrek, offered by BizReach, Inc. In the analysis, we raise several research questions, such as the difference in activation rate and conversion rate between user categories, and how the instantaneous rate of event occurrence changes as time passes. We provide quantitative answers and interpretations for them. Furthermore, the model is inferred in a Bayesian manner, which enables us to represent uncertainty with credible intervals for the parameters and predictive quantities.
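To make the mixture idea concrete, the following is a minimal sketch of one common way to combine a Bernoulli "active" indicator with a Weibull event time. The exact parameterization used in the paper is not given in the abstract, so the form below is an assumption: inactive users never experience the event, and active users have Weibull-distributed event times. The parameter names `p_active`, `shape`, and `scale` are hypothetical, and the hierarchical Bayesian extension is not shown.

```python
import math

def weibull_survival(t, shape, scale):
    """Probability that an active user's event time exceeds t."""
    return math.exp(-((t / scale) ** shape))

def weibull_pdf(t, shape, scale):
    """Density of the event time for an active user (requires t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1) * weibull_survival(t, shape, scale)

def mixture_survival(t, p_active, shape, scale):
    """Inactive users (probability 1 - p_active) never convert; active users follow a Weibull."""
    return (1.0 - p_active) + p_active * weibull_survival(t, shape, scale)

def log_likelihood(records, p_active, shape, scale):
    """records: (time, observed) pairs; observed=False means right-censored at `time`."""
    total = 0.0
    for t, observed in records:
        if observed:
            # The event happened, so the user must be active.
            total += math.log(p_active * weibull_pdf(t, shape, scale))
        else:
            # No event yet: the user is either inactive or active but still waiting.
            total += math.log(mixture_survival(t, p_active, shape, scale))
    return total

# Hypothetical records: two observed conversions and two censored users.
records = [(2.0, True), (5.5, True), (30.0, False), (30.0, False)]
print(log_likelihood(records, p_active=0.4, shape=1.2, scale=6.0))
```

Unlike a plain Weibull model, the mixture survival curve levels off at `1 - p_active` instead of decaying to zero, which is how this form captures users for whom the event never occurs.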
Job interviews are a fundamental activity for most corporations to acquire potential candidates, and for job seekers to get well-rewarded and fulfilling career opportunities. In many cases, interviews are conducted in multiple stages, such as telephone interviews followed by several face-to-face interviews. At each stage, candidates are evaluated on various aspects. Among these, grade evaluation, such as a rating on a 1-4 scale, is a reasonable method for evaluating candidates. However, because each evaluation is based on the subjective judgment of an interviewer, the aggregated evaluations can be biased when differences in the toughness of interviewers are not taken into account. Additionally, the toughness of interviewers might vary depending on the interview round. In this paper, we propose an analytical framework that simultaneously estimates both the true potential of candidates and the toughness of interviewers' judgment while considering job interview rounds, with algorithms that extract the true potential of candidates and the toughness of interviewers as latent variables by analyzing grade data from job interviews. We apply a Bayesian Hierarchical Ordered Probit Model to grade data from HRMOS, a cloud-based Applicant Tracking System (ATS) operated by BizReach, Inc., an IT start-up particularly addressing human-resource needs in Japan. Our model successfully quantifies the candidates' potential and the interviewers' toughness. An interpretation and applications of the model are given along with a discussion of its place within hiring processes in real-world settings. The parameters are estimated by Markov Chain Monte Carlo (MCMC). A discussion of uncertainty, as given by the posterior distribution of the parameters, is also provided along with the analysis.
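The ordered probit link between latent quantities and observed grades can be sketched as follows. The abstract does not give the model's exact form, so this is an assumption: the latent score is taken to be the candidate's potential minus the interviewer's toughness for the given round, and fixed cutpoints split that score into grades. The names `potential`, `toughness`, and `cutpoints` are hypothetical, and the hierarchical priors and MCMC estimation are not shown.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def grade_probabilities(potential, toughness, cutpoints):
    """Ordered probit: P(grade = k) = Phi(c_k - eta) - Phi(c_{k-1} - eta).

    eta = potential - toughness is an assumed linear predictor; `toughness`
    stands for the interviewer's toughness in the relevant interview round.
    `cutpoints` must be sorted in increasing order.
    """
    eta = potential - toughness
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [normal_cdf(c - eta) for c in bounds]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

# Three cutpoints give four grades, matching a 1-4 rating scale.
cutpoints = [-1.0, 0.0, 1.0]
lenient = grade_probabilities(potential=0.5, toughness=0.0, cutpoints=cutpoints)
tough = grade_probabilities(potential=0.5, toughness=1.0, cutpoints=cutpoints)
print(lenient)
print(tough)
```

Under this sketch, the same candidate is less likely to receive the top grade from a tougher interviewer, which is the bias that estimating toughness jointly with potential is meant to correct.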