In this paper we introduce the supermarket setout problem, which aims at finding an appropriate setout of products (i.e., which product should be placed at which position on which shelf) with respect to expected profit, access time (the time required to find a product), and similar criteria. Finding a good setout of products is important both in the customer area and in the background store of supermarkets. The admissible setouts are constrained by laws (for example, food may not be placed next to chemicals) and by "setout traditions" (even if the law allows two products to be placed next to each other, the result may look "strange"). After all such requirements are taken into account, a great number of possible setouts usually remain, and these generally differ greatly in aspects such as expected profit, customer satisfaction, or the time required to find a product. Therefore, finding an appropriate (near-optimal) solution is a crucial issue. Complex business problems, like the above supermarket setout problem, are often solved by means of artificial intelligence techniques that exploit results of advanced statistical analysis or data mining. In this paper, we develop a new algorithm for the supermarket setout problem, based on a combination of constraint satisfaction algorithms and frequent itemset mining (also known as market basket analysis).
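To make the combination of frequent itemset mining and constraint satisfaction concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a single shelf modelled as a linear order of products, adjacency constraints given as forbidden pairs, and an objective that rewards placing frequently co-purchased products next to each other. All function names, the objective, and the brute-force search are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations, permutations


def frequent_pairs(baskets, min_support):
    """Count co-occurring product pairs; keep those seen in >= min_support baskets."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def is_admissible(setout, forbidden):
    """Constraint check: no forbidden pair may occupy adjacent shelf positions."""
    return all(frozenset((a, b)) not in forbidden
               for a, b in zip(setout, setout[1:]))


def score(setout, pairs):
    """Assumed objective: total support of frequent pairs placed side by side."""
    return sum(pairs.get(tuple(sorted((a, b))), 0)
               for a, b in zip(setout, setout[1:]))


def best_setout(products, baskets, forbidden, min_support=2):
    """Exhaustive search over admissible orders; feasible only for tiny inputs."""
    pairs = frequent_pairs(baskets, min_support)
    candidates = (p for p in permutations(products) if is_admissible(p, forbidden))
    return max(candidates, key=lambda p: score(p, pairs), default=None)


if __name__ == "__main__":
    products = ["bread", "butter", "bleach", "jam"]
    baskets = [["bread", "butter"], ["bread", "butter", "jam"],
               ["bread", "jam"], ["bleach"]]
    # Hypothetical legal constraint: bleach may not sit next to bread or butter.
    forbidden = {frozenset(("bread", "bleach")), frozenset(("butter", "bleach"))}
    print(best_setout(products, baskets, forbidden))
```

Exhaustive enumeration is used here only to keep the sketch short; a realistic instance would need a proper constraint solver or heuristic search, since the number of candidate orders grows factorially with the number of products.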
One of the most prominent challenges in data mining is the clustering of databases containing many categorical attributes. Representing such data in continuous, Euclidean space usually does not reflect the true segments of the data. As a crucial consequence, clustering algorithms working in continuous, Euclidean space may produce segmentations of poor quality. An alternative direction explores graph-based representation of data. In this paper, we show that graph-based data representation is well suited to the case of categorical attributes. In particular, we offer the following contributions: i) we propose and analyze a flexible graph-based genetic clustering algorithm, where the ideal clusters can be characterized using external cluster quality functions, called kernels, ii) we study kernels and define the crucial property of effective kernels, iii) we introduce a framework for distributed data-oriented graph clustering computations. Despite the complexity of our problem, which our analysis shows to be NP-hard, experiments show that for well-clusterable data our algorithm has attractive scalability properties. We also perform experiments on real medical data, which provide further evidence of the practical applicability of our approach.
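The idea of scoring clusters of categorical records with an external quality function can be illustrated with a small sketch. It is not the paper's algorithm: the graph construction (an edge when two records agree on enough attributes), the density-based kernel, and the size-weighted aggregation are all assumptions chosen for illustration, and the genetic search itself is omitted.

```python
from itertools import combinations


def similarity_graph(records, min_shared=1):
    """Link records i < j that agree on at least min_shared attributes.

    Returns a dict mapping (i, j) to the number of shared attribute values.
    """
    edges = {}
    for i, j in combinations(range(len(records)), 2):
        shared = sum(a == b for a, b in zip(records[i], records[j]))
        if shared >= min_shared:
            edges[(i, j)] = shared
    return edges


def density_kernel(cluster, edges):
    """Hypothetical kernel: fraction of node pairs in the cluster that are linked."""
    nodes = sorted(cluster)
    if len(nodes) < 2:
        return 1.0  # a singleton is treated as trivially cohesive
    pairs = list(combinations(nodes, 2))
    return sum(1 for p in pairs if p in edges) / len(pairs)


def partition_quality(partition, edges, kernel=density_kernel):
    """Size-weighted average of the kernel over all clusters of a partition."""
    total = sum(len(c) for c in partition)
    return sum(len(c) * kernel(c, edges) for c in partition) / total


if __name__ == "__main__":
    records = [("red", "small"), ("red", "small"),
               ("blue", "large"), ("blue", "large")]
    edges = similarity_graph(records, min_shared=2)
    print(partition_quality([{0, 1}, {2, 3}], edges))
    print(partition_quality([{0, 2}, {1, 3}], edges))
```

Because the kernel is passed in as a parameter, a search procedure (genetic or otherwise) could use `partition_quality` as its fitness function without knowing how cluster quality is defined.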