2018
DOI: 10.3390/ijgi8010013
|View full text |Cite
|
Sign up to set email alerts
|

A Cluster-Based Machine Learning Ensemble Approach for Geospatial Data: Estimation of Health Insurance Status in Missouri

Abstract: Mainstream machine learning approaches to predictive analytics consistently prove their ability to perform well using a variety of datasets, although the task of identifying an optimally-performing machine learning approach for any given dataset becomes much less intuitive. Methods such as ensemble and transformation modeling have been developed to improve upon individual base learners and datasets with large degrees of variance. Despite the increased generalizability and flexibility of ensemble approaches, th… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
4
1

Citation Types

1
14
0

Year Published

2019
2019
2024
2024

Publication Types

Select...
6
1

Relationship

1
6

Authors

Journals

citations
Cited by 12 publications
(15 citation statements)
references
References 16 publications
1
14
0
Order By: Relevance
“…The algorithms developed, along with methods for determining which variable, or combination of variables, to use as the clustering variable, were demonstrated on a dataset where the objective was to determine the algorithms' ability to predict the distribution of health insurance status across the state of Missouri using demographic, socioeconomic, access, and relative-location information as independent variable loadings. Results from the research found that using a cluster-based ensemble approach outperformed all other modeling techniques evaluated in the study, including linear and nonlinear base-learning algorithms and three aggregate ensemble techniques [15]. In addition to improved predictive performance, the cluster-based ensemble also provided increased inferential power in assessing relative variable importance.…”
Section: Introductionmentioning
confidence: 94%
See 2 more Smart Citations
“…The algorithms developed, along with methods for determining which variable, or combination of variables, to use as the clustering variable, were demonstrated on a dataset where the objective was to determine the algorithms' ability to predict the distribution of health insurance status across the state of Missouri using demographic, socioeconomic, access, and relative-location information as independent variable loadings. Results from the research found that using a cluster-based ensemble approach outperformed all other modeling techniques evaluated in the study, including linear and nonlinear base-learning algorithms and three aggregate ensemble techniques [15]. In addition to improved predictive performance, the cluster-based ensemble also provided increased inferential power in assessing relative variable importance.…”
Section: Introductionmentioning
confidence: 94%
“…Building on methods from Trivedi et al [23] and previous research of Mueller et al [15] that examined the relationship between cluster analysis as a preprocessing technique and predictive performance associated with various base learner and ensemble machine learning models, this study sought to enhance algorithmic performance of the ensemble methodology for geospatial data through a technique known as synthetic population generation. The algorithms developed, along with methods for determining which variable, or combination of variables, to use as the clustering variable, were demonstrated on a dataset where the objective was to determine the algorithms' ability to predict the distribution of health insurance status across the state of Missouri using demographic, socioeconomic, access, and relative-location information as independent variable loadings.…”
Section: Introductionmentioning
confidence: 99%
See 1 more Smart Citation
“…It would be tautological to argue that such data would be utilised subsequently to determine the subsequent set of premiums (once a customer was onboard) to be paid by the customers. This could be carried out based upon machine learning (ML) or artificial intelligence (AI) based predictive analytics (Kose et al, 2015; Kuo et al, 2007; Mueller et al, 2019). Once the factor inputs data (a large set of it) was available, ML and AI could help HI managers to ascertain the most appropriate premium values.…”
Section: Futuristic Perspectivementioning
confidence: 99%
“…As an important data mining task, clustering is useful for exploring patterns in GTS by assigning similar data elements into the same cluster and dissimilar elements into different ones [7,8]. As a result, it provides both an overview of data at cluster levels and investigation of details on single clusters [9,10].…”
Section: Introductionmentioning
confidence: 99%