Sparse Multinomial Logistic Regression (SMLR) is widely used in image classification, multi-class object recognition, and related tasks because it performs feature selection as part of classification. However, it cannot meet the time and memory requirements of processing large-scale data. We revisit the classification accuracy and running efficiency of solving the SMLR problem with the Alternating Direction Method of Multipliers (ADMM); we call the resulting algorithm fast SMLR (FSMLR) in this paper. By reformulating the FSMLR optimization problem, we transform the serial convex optimization problem into distributed convex optimization problems, namely the global consensus problem and the sharing problem. Based on these distributed formulations, we propose two distributed parallel SMLR algorithms for big data scenarios: sample partitioning-based distributed SMLR (SP-SMLR) for datasets with a large number of samples, and feature partitioning-based distributed SMLR (FP-SMLR) for datasets with a large number of features. The experimental results show that the FSMLR algorithm achieves higher accuracy than the original SMLR algorithm. The big data experiments show that our distributed parallel SMLR algorithms scale to massive samples and large-scale features with high precision. In short, our proposed serial and distributed SMLR algorithms outperform the state-of-the-art algorithms.

INDEX TERMS Alternating Direction Method of Multipliers, big data, distributed parallel, sparse multinomial logistic regression.
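
For orientation, the kind of problem an ADMM-based SMLR solver addresses can be sketched as follows. This is the generic textbook form of l1-regularized multinomial logistic regression with a standard scaled-form ADMM splitting, not necessarily the exact formulation used for FSMLR. The symbols (samples x_i, one-hot labels y_ik, weight matrix W, auxiliary variable Z, scaled dual variable U, regularization weight λ, penalty parameter ρ, soft-thresholding operator S) are notation assumed here for illustration.

```latex
% Assumed notation: n samples x_i in R^d, one-hot labels y_{ik}, K classes,
% weight matrix W = [w_1, ..., w_K] in R^{d x K}.
\begin{align}
  % Negative log-likelihood of the multinomial logistic model
  \ell(W) &= -\sum_{i=1}^{n} \Bigl[ \sum_{k=1}^{K} y_{ik}\, w_k^{\top} x_i
             - \log \sum_{j=1}^{K} \exp\bigl(w_j^{\top} x_i\bigr) \Bigr], \\
  % Sparse (l1-regularized) problem, split with an auxiliary variable Z
  \min_{W,\,Z}\;& \ell(W) + \lambda \lVert Z \rVert_1
  \quad \text{s.t.} \quad W = Z, \\
  % Scaled-form ADMM iterations with penalty parameter \rho > 0
  W^{t+1} &= \arg\min_{W}\; \ell(W) + \tfrac{\rho}{2}\,
             \lVert W - Z^{t} + U^{t} \rVert_F^{2}, \\
  Z^{t+1} &= S_{\lambda/\rho}\bigl(W^{t+1} + U^{t}\bigr), \\
  U^{t+1} &= U^{t} + W^{t+1} - Z^{t+1},
\end{align}
% Elementwise soft-thresholding: S_{\kappa}(a) = \operatorname{sign}(a)\,\max(|a| - \kappa,\, 0).
```

In a splitting of this kind the smooth likelihood term and the non-smooth l1 term are handled in separate updates, and sparsity enters only through the soft-thresholding step. That separation is what makes consensus-style and sharing-style distributed variants natural to derive.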