Geographically weighted regression (GWR) introduces the distance weighted kernel function to examine the non-stationarity of geographical phenomena and improve the performance of global regression. However, GWR calibration becomes critical when using a serial computing mode to process large volumes of data. To address this problem, an improved approach based on the compute unified device architecture (CUDA) parallel architecture fast-parallel-GWR (FPGWR) is proposed in this paper to efficiently handle the computational demands of performing GWR over millions of data points. FPGWR is capable of decomposing the serial process into parallel atomic modules and optimizing the memory usage. To verify the computing capability of FPGWR, we designed simulation datasets and performed corresponding testing experiments. We also compared the performance of FPGWR and other GWR software packages using open datasets. The results show that the runtime of FPGWR is negatively correlated with the CUDA core number, and the calculation efficiency of FPGWR achieves a rate of thousands or even tens of thousands times faster than the traditional GWR algorithms. FPGWR provides an effective tool for exploring spatial heterogeneity for large-scale geographic data (geodata).
Geographically weighted regression (GWR) is a classical method for estimating nonstationary relationships. Notwithstanding the great potential of the model for processing geographic data, its large-scale application still faces the challenge of high computational costs. To solve this problem, we proposed a computationally efficient GWR method, called K-Nearest Neighbors Geographically weighted regression (KNN-GWR). First, it utilizes a k-dimensional tree (KD tree) strategy to improve the speed of finding observations around the regression points, and, to optimize the memory complexity, the submatrices of neighbors are extracted from the matrix of the sample dataset. Next, the optimal bandwidth is found by referring to the spatial clustering relationship explained by K-means. Finally, the performance and accuracy of the proposed KNN-GWR method was evaluated using a simulated dataset and a Chinese house price dataset. The results demonstrated that the KNN-GWR method achieved computational efficiency thousands of times faster than existing GWR algorithms, while ensuring accuracy and significantly improving memory optimization. To the best of our knowledge, this method was able to run hundreds of thousands or millions of data on a standard computer, which can inform improvement in the efficiency of local regression models.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.