Data clustering has been studied extensively, but almost all conventional clustering algorithms tend to break down in high-dimensional spaces because of the inherent sparsity of the data points. Existing subspace clustering algorithms for high-dimensional data focus on numerical dimensions. In this paper, we designed an iterative algorithm called SUBCAD for clustering high-dimensional categorical data sets, based on the minimization of an objective function for clustering. We deduced rules for changing cluster memberships from this objective function. We also designed an objective function to determine the subspace associated with each cluster and proved various properties of it that are essential for designing a fast algorithm to find that subspace. Finally, we carried out experiments to show the effectiveness of the proposed method and algorithm.
Variable annuity (VA) products expose insurance companies to considerable risk because of the guarantees they provide to buyers. Managing and hedging these risks requires insurers to compute key risk metrics for a large portfolio of VA products. In practice, many companies rely on nested Monte Carlo (MC) simulations to compute these metrics. MC simulations are computationally demanding, forcing insurance companies to invest hundreds of thousands of dollars per year in computational infrastructure. Moreover, existing academic methodologies focus on the fair valuation of a single VA contract, exploiting ideas from option theory and regression. In most cases, the computational complexity of these methods exceeds that of MC simulations, so they do not scale well to large portfolios of VA contracts. In this paper, we present a framework for valuing such portfolios based on spatial interpolation. We provide a comprehensive study of this framework and compare existing interpolation schemes. Our numerical results show that these interpolation-based methods outperform nested MC simulations in both computational efficiency and accuracy. We also present insights into the challenge of finding an effective interpolation scheme in this framework, and suggest guidelines that help us build a fully automated scheme that is efficient and accurate.
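To make the idea of valuing a portfolio by spatial interpolation concrete, the following is a minimal sketch under stated assumptions, not the scheme used in the paper. It assumes that each contract is described by a tuple of numeric attributes already scaled to comparable ranges, that a small set of sample contracts has been valued by some expensive method such as MC, and that inverse distance weighting (one well-known spatial interpolation scheme) is used to estimate the remaining contracts. The function names `idw_estimate` and `portfolio_value` and the `power` parameter are hypothetical.

```python
import math


def idw_estimate(target, samples, values, power=2.0):
    """Estimate the value at `target` by inverse distance weighting.

    `samples` are attribute tuples of contracts whose values are already
    known (assumed to come from an expensive valuation such as MC), and
    `values` are those known values, in the same order.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in zip(samples, values):
        dist = math.dist(target, point)  # Euclidean distance in attribute space
        if dist == 0.0:
            # The target coincides with a sample contract: use its value directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def portfolio_value(contracts, samples, values, power=2.0):
    """Sum the interpolated values of every contract in the portfolio."""
    return sum(idw_estimate(c, samples, values, power) for c in contracts)
```

In this sketch the expensive valuation runs only on the sample contracts, and every other contract costs one pass over the samples. How the samples are chosen and which distance and weighting are appropriate are exactly the kind of design choices the abstract says the paper studies; nothing here should be read as its conclusions.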