High resolution, contemporary data on human population distributions are vital for measuring impacts of population growth, monitoring human-environment interactions and for planning and policy development. Many methods are used to disaggregate census data and predict population densities for finer scale, gridded population data sets. We present a new semi-automated dasymetric modeling approach that incorporates detailed census and ancillary data in a flexible, “Random Forest” estimation technique. We outline the combination of widely available, remotely-sensed and geospatial data that contribute to the modeled dasymetric weights and then use the Random Forest model to generate a gridded prediction of population density at ~100 m spatial resolution. This prediction layer is then used as the weighting surface to perform dasymetric redistribution of the census counts at a country level. As a case study we compare the new algorithm and its products for three countries (Vietnam, Cambodia, and Kenya) with other common gridded population data production methodologies. We discuss the advantages of the new method and its gains in accuracy and flexibility over those previous approaches. Finally, we outline how this algorithm will be extended to provide freely-available gridded population data sets for Africa, Asia and Latin America.
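The dasymetric redistribution step described above can be illustrated with a minimal sketch. It assumes the weighting surface (for example, a predicted density layer) has already been produced and flattened to one value per grid cell; the function name, the toy inputs, and the uniform fallback for units whose weights sum to zero are illustrative assumptions, not details taken from the abstract.

```python
from collections import defaultdict


def dasymetric_redistribute(weights, unit_ids, census_counts):
    """Split each census unit's count across its cells in proportion to the weights.

    weights       -- one non-negative weight per grid cell (hypothetical input)
    unit_ids      -- the census unit each cell belongs to, same length as weights
    census_counts -- mapping from unit id to that unit's census count
    """
    weight_totals = defaultdict(float)
    cell_counts = defaultdict(int)
    for w, u in zip(weights, unit_ids):
        weight_totals[u] += w
        cell_counts[u] += 1

    allocated = []
    for w, u in zip(weights, unit_ids):
        if weight_totals[u] > 0:
            share = w / weight_totals[u]
        else:
            # Assumption: fall back to a uniform split when all weights are zero.
            share = 1.0 / cell_counts[u]
        allocated.append(census_counts[u] * share)
    return allocated


# Toy example: four cells in two census units.
weights = [1.0, 3.0, 2.0, 2.0]
unit_ids = ["A", "A", "B", "B"]
census_counts = {"A": 100, "B": 50}
cells = dasymetric_redistribute(weights, unit_ids, census_counts)
```

Because each unit's shares sum to one, the cell values within a unit add back up to the census count for that unit, which is the defining property of a dasymetric redistribution.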
Geographical factors have influenced the distribution and density of the global human population for centuries. Climatic regimes have made some regions more habitable than others, harsh topography has discouraged human settlement, and transport links have encouraged population growth. A better understanding of these types of relationships enables both improved mapping of population distributions today and modelling of future scenarios. However, few comprehensive studies of the relationships between population spatial distributions and the range of drivers and correlates that exist have been undertaken at all, much less at high spatial resolutions, and particularly across the low- and middle-income countries. Here, we quantify the relative importance of multiple types of drivers and covariates in explaining observed population densities across 32 low- and middle-income countries over four continents using machine-learning approaches. We find that, while relationships between population densities and geographical factors show some variation between regions, they are generally remarkably consistent, pointing to universal drivers of human population distribution. In particular, we find that a set of geographical features relating to the built environment, ecology and topography consistently explains the majority of variability in population distributions at fine spatial scales across the low- and middle-income regions of the world.
The age group composition of populations varies substantially across continents and within countries, and is linked to levels of development, health status and poverty. The subnational variability in the shape of the population pyramid as well as the respective dependency ratio are reflective of the different levels of development of a country and are drivers for a country’s economic prospects and health burdens. Whether measured as the ratio between those of working age and those young and old who are dependent upon them, or through separate young and old-age metrics, dependency ratios are often highly heterogeneous between and within countries. Assessments of subnational dependency ratio and age structure patterns have been undertaken for specific countries and across high income regions, but to a lesser extent across the low income regions. In the framework of the WorldPop Project, through the assembly of over 100 million records across 6,389 subnational administrative units, subnational dependency ratio and high resolution gridded age/sex group datasets were produced for 87 countries in Africa and Asia.
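The dependency ratios mentioned above can be computed as in the following minimal sketch. The age bands are an assumption: the abstract does not state thresholds, so the sketch takes the young, working-age, and old counts as separate inputs (a common convention is under 15, 15 to 64, and 65 and over). The function name and the example figures are hypothetical.

```python
def dependency_ratios(young, working_age, old):
    """Return total, young, and old-age dependency ratios per 100 working-age people.

    The split into the three groups is left to the caller, since the
    age thresholds are an assumption rather than something given in the text.
    """
    if working_age <= 0:
        raise ValueError("working_age must be positive")
    return {
        "total": 100.0 * (young + old) / working_age,
        "young": 100.0 * young / working_age,
        "old": 100.0 * old / working_age,
    }


# Hypothetical counts for one subnational unit.
ratios = dependency_ratios(young=30_000, working_age=60_000, old=10_000)
```

The young and old-age ratios sum to the total ratio, so either the combined metric or the two separate ones can be reported from the same inputs.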
Interactions between humans, diseases, and the environment take place across a range of temporal and spatial scales, making accurate, contemporary data on human population distributions critical for a variety of disciplines. Methods for disaggregating census data to finer-scale, gridded population density estimates continue to be refined as computational power increases and more detailed census, input, and validation datasets become available. However, the availability of spatially detailed census data still varies widely by country. In this study, we develop quantitative guidelines for choosing regionally-parameterized census count disaggregation models over country-specific models. We examine underlying methodological considerations for improving gridded population datasets for countries with coarser scale census data by investigating regional versus country-specific models used to estimate density surfaces for redistributing census counts. Consideration is given to the spatial resolution of input census data using examples from East Africa and Southeast Asia. Results suggest that for many countries more accurate population maps can be produced by using regionally-parameterized models where more spatially refined data exists than that which is available for the focal country. This study highlights the advancement of statistical toolsets and considerations for underlying data used in generating widely used gridded population data.
Large-scale gridded population datasets are usually produced for the year of input census data using a top-down approach and projected backward and forward in time using national growth rates. Such temporal projections do not include any subnational variation in population distribution trends and ignore changes in geographical covariates such as urban land cover changes. Improved predictions of population distribution changes over time require the use of a limited number of covariates that are time-invariant or temporally explicit. Here we make use of recently released multi-temporal high-resolution global settlement layers, historical census data and the latest developments in population distribution modelling methods to reconstruct population distribution changes over 30 years across the Kenyan Coast. We explore the methodological challenges associated with the production of gridded population distribution time-series in data-scarce countries and show that trade-offs have to be found between spatial and temporal resolutions when selecting the best modelling approach. Strategies used to fill data gaps may vary according to the local context and the objective of the study. We hope this work will serve as a benchmark for future developments of population distribution time-series that are increasingly required for population-at-risk estimations and spatial modelling in various fields.
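The limitation of projecting a grid with a single national growth rate can be seen in a short sketch. It assumes exponential growth, which is one common convention and is not specified in the abstract; the function name, rate, and cell values are hypothetical.

```python
import math


def project_grid(cells, national_rate, years):
    """Scale every cell by the same national growth factor.

    national_rate -- annual growth rate as a fraction (assumed exponential model)
    years         -- positive to project forward, negative to project backward
    """
    factor = math.exp(national_rate * years)
    return [c * factor for c in cells]


def shares(cells):
    """Each cell's fraction of the total population."""
    total = sum(cells)
    return [c / total for c in cells]


base = [100.0, 300.0, 600.0]           # hypothetical census-year grid
future = project_grid(base, 0.02, 10)  # ten years forward at 2% per year
past = project_grid(base, 0.02, -10)   # ten years backward
```

Every cell is multiplied by the same factor, so `shares(future)` and `shares(past)` match `shares(base)`: the totals change but the spatial pattern does not, which is the lack of subnational variation the abstract points to.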