Open card sorting is a well-established method for discovering how people understand and categorize information. This paper addresses the problem of quantitatively analyzing open card sorting data using the K-means algorithm. Although K-means is effective, its results are highly sensitive to the initial category centers. Many approaches in the literature have therefore focused on determining suitable initial centers. However, finding suitable centers is not always possible, especially as the number of categories increases. This paper proposes an initialization approach that improves the quality of the solution produced by K-means for open card sort data analysis. Results show that the proposed initialization approach outperforms existing initialization methods such as MaxMin, random initialization, and K-means++. The proposed algorithm is applied to a real-world open card sorting dataset and, unlike existing solutions in the literature, it can be used with any number of participants and cards.
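To make the initialization problem concrete, the following is a minimal sketch of MaxMin (farthest-first) seeding, one of the baseline methods named in the abstract. It is not the initialization approach the paper proposes. The function name `maxmin_init`, the choice of the first point as the starting center, and the representation of each card as a numeric vector are all assumptions made for illustration.

```python
import math


def maxmin_init(points, k, first=0):
    """Pick k initial center indices by MaxMin (farthest-first) seeding.

    Assumptions (not from the source): cards are numeric vectors, distance
    is Euclidean, and the starting center is the point at index `first`.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    centers = [first]
    # Distance from every point to its nearest chosen center so far.
    min_dist = [math.dist(p, points[first]) for p in points]

    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        centers.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return centers


# Hypothetical toy data: five 2-D points standing in for card vectors.
points = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 8)]
print(maxmin_init(points, 3))
```

Because each new center depends on the ones already chosen, the result changes with the starting point, which illustrates the sensitivity to initial centers that the abstract describes. If the data contain duplicate points, this sketch may select the same location twice.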
Open card sorting is a widely used method in HCI for the design of user-centered Information Architectures (IAs). This article proposes a new algorithm that combines the best merge method (BMM), category validity technique (CVT), and multidimensional scaling (MDS) to explore, analyze, and visualize open card sort data. A study involving 20 participants and 41 cards explored the IA redesign of a university's website. The collected data were analyzed using two popular methods for the quantitative analysis of open card sort data (hierarchical clustering and K-means) and the proposed algorithm. The proposed algorithm provided more IA insights than the existing methods. Specifically, it can expose hidden patterns and relationships among cards and identify complexities. We also found that the proposed algorithm produces better initial clusters, which directly affect the final clustering quality.
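Methods such as hierarchical clustering and MDS operate on pairwise dissimilarities between cards. The abstract does not say how these were computed, so the sketch below assumes a common convention: the dissimilarity between two cards is one minus the fraction of participants who placed them in the same group. The function name `cooccurrence_distance` and the input layout (one list of groups per participant) are hypothetical.

```python
from itertools import combinations


def cooccurrence_distance(sorts, cards):
    """Build a card-by-card dissimilarity matrix from open card sorts.

    Assumption (not from the source): distance = 1 - (number of participants
    who grouped the two cards together) / (number of participants).
    `sorts` holds one entry per participant; each entry is a list of groups,
    and each group is a list of card names.
    """
    index = {card: i for i, card in enumerate(cards)}
    n = len(cards)
    counts = [[0] * n for _ in range(n)]

    for groups in sorts:
        for group in groups:
            # Every pair of cards in the same group co-occurs once.
            for a, b in combinations(group, 2):
                i, j = index[a], index[b]
                counts[i][j] += 1
                counts[j][i] += 1

    participants = len(sorts)
    return [
        [0.0 if i == j else 1.0 - counts[i][j] / participants for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: two participants sorting three cards.
sorts = [
    [["A", "B"], ["C"]],
    [["A"], ["B", "C"]],
]
for row in cooccurrence_distance(sorts, ["A", "B", "C"]):
    print(row)
```

A matrix of this form can be passed to a hierarchical clustering or MDS routine that accepts precomputed dissimilarities.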
Cluster analysis of real-life data often has to contend with noisy data and with uncertainty in the main clustering variable owing to its stochastic nature, both of which can affect clustering performance. In this study, we propose a novel clustering technique that handles noise and uncertainty in a dataset by adopting a stochastic approach: instead of treating data points as exact values, it assumes that they follow a continuous probability distribution. By estimating the best-fit probability distribution of the clustering variable, the proposed method formulates the search for the most homogeneous clusters, that is, the optimum cluster partitions (OCP), as a mathematical programming problem (MPP). A computer-intensive dynamic programming technique is used to solve the MPP and determine the OCP, which minimizes the sum of the weighted intracluster standard deviations. The proposed technique is demonstrated using univariate data that follow a normal distribution, which is symmetric, and a Weibull distribution, which is skewed. Numerical examples are presented to illustrate the computational details of the proposed method. Finally, using both simulated and real datasets, the effectiveness of the proposed technique is compared against four advanced clustering methods: k-means, fuzzy c-means, expectation-maximization, and Genie++ hierarchical clustering. The results show that the proposed method works well and produces more efficient clusters than the other methods.
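The dynamic programming idea can be illustrated with a simplified sketch. It does not reproduce the paper's method: it works on the observed sample values rather than on a fitted probability distribution, it restricts clusters to contiguous runs of the sorted data, and it weights each cluster's standard deviation by its share of the data. All three choices, and the function name `optimal_partition`, are assumptions made for illustration.

```python
from statistics import pstdev


def optimal_partition(values, k):
    """Split sorted univariate data into k contiguous clusters by DP.

    Assumptions (not from the source): empirical values are used instead of
    a fitted distribution, and each cluster's population standard deviation
    is weighted by cluster_size / n.
    Returns (clusters, total_weighted_std).
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    def cost(i, j):
        # Weighted standard deviation of the segment xs[i:j].
        return (j - i) / n * pstdev(xs[i:j])

    inf = float("inf")
    # best[c][j]: minimum cost of splitting xs[:j] into c clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = best[c - 1][i] + cost(i, j)
                if candidate < best[c][j]:
                    best[c][j] = candidate
                    cut[c][j] = i

    # Walk the stored cut points backwards to recover the clusters.
    clusters = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return clusters, best[k][n]


# Hypothetical toy data with three visually separate groups.
clusters, total = optimal_partition([1, 2, 3, 10, 11, 12, 20, 21, 22], 3)
print(clusters, total)
```

This sketch recomputes each segment's standard deviation from scratch, which is acceptable for small inputs; prefix sums of the values and their squares would avoid the repeated work on larger ones.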