Abstract: Attribute (or feature) selection is one of the basic strategies for improving the performance of data classification tasks while reducing the complexity of classifiers, and it is particularly important when the number of attributes is relatively high. Evolutionary computation has already proven to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We propose the application of the multi-objective evolutionary algorithm ENORA to the task of feature selection for multi-class classification of data extracted from an integrated multi-channel multi-skill contact center, which includes technical, service and central data for each session. Additionally, we propose a methodology to integrate feature selection for classification, model evaluation, and decision making in order to choose the most satisfactory model according to an a posteriori process in a multi-objective context. We assess our results by comparing the performance and the classification rate against those of the well-known multi-objective evolutionary algorithm NSGA-II. Finally, the best solution obtained is validated by a data expert's semantic interpretation of the classifier.
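The a posteriori selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the ENORA or NSGA-II procedure itself, only the generic Pareto-dominance filter that multi-objective methods of this kind rely on. The two objectives (classification error and number of selected attributes, both minimized), the function names, and the candidate values are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate: (classification error, number of selected attributes).
# Both objectives are assumed to be minimized.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only; they are not results from the paper.
candidates = [(0.20, 10), (0.25, 4), (0.20, 12), (0.30, 4), (0.18, 15)]
front = pareto_front(candidates)
```

A decision maker would then pick one model from `front` according to their own preferences, for instance trading a slightly higher error for far fewer attributes.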
Temporal information plays a very important role in many analysis tasks, and can be encoded in at least two different ways. It can be modeled by discrete sequences of events as, for example, in the business intelligence domain, with the aim of tracking the evolution of customer behaviors over time. Alternatively, it can be represented by time series, as in the stock market to characterize price histories. In some analysis tasks, temporal information is complemented by other kinds of data, which may be represented by static attributes, e.g., categorical or numerical ones. This paper presents J48SS, a novel decision tree inducer capable of natively mixing static (i.e., numerical and categorical), sequential, and time series data for classification purposes. The novel algorithm is based on the popular C4.5 decision tree learner, and it relies on the concepts of frequent pattern extraction and time series shapelet generation. The algorithm is evaluated on a text classification task in a real business setting, as well as on a selection of public UCR time series datasets. Results show that it is capable of providing competitive classification performances, while generating highly interpretable models and effectively reducing the data preparation effort.
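As a rough illustration of the shapelet concept this abstract relies on, the sketch below computes the usual distance between a time series and a shapelet: the smallest Euclidean distance between the shapelet and any subsequence of the series with the same length. This is the textbook definition, not necessarily the exact measure used by J48SS, and the function name and sample values are assumptions.

```python
import math
from typing import Sequence


def shapelet_distance(series: Sequence[float], shapelet: Sequence[float]) -> float:
    """Minimum Euclidean distance between the shapelet and any
    equal-length contiguous subsequence of the series."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    return min(
        math.sqrt(sum((series[i + j] - shapelet[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )


# The shapelet occurs exactly in the first series, so the distance is zero.
d_match = shapelet_distance([0, 1, 2, 3, 2, 1], [2, 3, 2])
# In a flat series every window is equally far from the shapelet.
d_flat = shapelet_distance([0, 0, 0, 0], [3, 4])
```

A decision tree node can then split on such a distance with a threshold, in the same way it would split on an ordinary numerical attribute.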
The interpretability of classification systems refers to their ability to express their behaviour in a way that is easily understandable by a user. Interpretable classification models allow for external validation by an expert and, in certain disciplines such as medicine or business, providing information about decision making is essential for ethical and human reasons. Fuzzy rule-based classification systems are well-established, powerful classification tools based on fuzzy logic and designed to produce interpretable models; however, in the presence of a large number of attributes, even rule-based models tend to be too complex to be easily interpreted. In this work, we propose a novel multivariate feature selection method in which both the search strategy and the classifier are based on multi-objective evolutionary computation. We designed a set of experiments to establish an acceptable setting with respect to the number of evaluations required by the search strategy and by the classifier, and we tested our strategy on a real-life dataset. Then, we compared our results against a wide range of feature selection methods that includes filter, wrapper, multivariate and univariate methods, with deterministic and probabilistic search strategies, and with evaluators of diverse nature. Finally, the fuzzy rule-based classification model obtained with the proposed method has been evaluated with standard performance metrics and compared with other well-known fuzzy rule-based classifiers. We have used two real-life datasets extracted from a contact center; in the first case, the proposed method obtained an accuracy of 0.7857 with 8 rules, while the best of the compared fuzzy classifiers obtained 0.7679 with 8 rules, and in the second case, it obtained an accuracy of 0.7403 with 5 rules, while the best of the compared fuzzy classifiers obtained 0.6364 with 4 rules.
Multi-channel contact centers are an increasingly important component of today's business world. They serve as a primary customer-facing channel for firms in many different industries, and employ millions of operators across the globe. During their operation, they generate vast amounts of data, ranging from automatically registered logs to handwritten notes and voice recordings. Unfortunately, in most firms, the data of interest is unstructured and stored in several databases, which makes it very hard to exploit. This article presents a decision support system for a multi-channel, multi-service contact center for front office business process outsourcing, along with its prospective extension to a decision management system. Its core is an enterprise-wide data warehouse, based on the general concept of an event. The proposed system supports a broad new set of advanced analysis tasks, ranging from operator performance assessment to call-flow simulation and data mining, giving operational and management staff a basis for making effective operational and strategic decisions.