This paper reviews the main applications of geostatistics to the description and modeling of the spatial variability of microbiological and physico-chemical soil properties. First, basic geostatistical tools such as the correlogram and semivariogram are introduced to characterize the spatial variability of each attribute separately as well as their spatial interactions. Then, the key issue of fitting permissible models to experimental semivariograms is addressed for the univariate and multivariate situations. Capitalizing on this model of spatial dependence, the value of a soil property can be predicted at unsampled locations using only observations of this particular property (kriging) or incorporating additional information provided by other correlated properties (cokriging). Factorial kriging allows one to discriminate the different sources of spatial variation in soil on the basis of the scale at which they operate, and it often enhances relations between soil attributes which were blurred in a traditional correlation analysis where the different sources of variation are mixed. Geostatistics can also be used to assess the risk of exceeding critical values (regulatory thresholds, soil quality criteria) at unsampled locations, and to simulate the spatial distribution of attribute values. All the different tools are illustrated using two transects of 100 pH and electrical conductivity values measured in pasture and forest.
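As an illustration of the experimental semivariogram mentioned in this abstract, the following is a minimal sketch, not the paper's own code. It assumes a regularly spaced transect and uses the classical (Matheron) estimator, half the mean squared difference between values separated by a lag of h sampling intervals. The function name `experimental_semivariogram` and the synthetic 100-value transect are hypothetical and stand in for the pH data, which are not reproduced here.

```python
import random


def experimental_semivariogram(values, max_lag):
    """Classical (Matheron) semivariogram estimator for a regularly spaced transect.

    Returns (lags, gamma), where gamma[k] is half the mean squared
    difference between all pairs of values separated by lags[k] steps.
    """
    z = [float(v) for v in values]
    if not 1 <= max_lag < len(z):
        raise ValueError("max_lag must be between 1 and len(values) - 1")
    lags, gamma = [], []
    for h in range(1, max_lag + 1):
        # Squared increments for every pair of points h steps apart.
        squared = [(z[i + h] - z[i]) ** 2 for i in range(len(z) - h)]
        lags.append(h)
        gamma.append(0.5 * sum(squared) / len(squared))
    return lags, gamma


# Synthetic stand-in for a 100-point transect (assumption: a random walk
# around pH 6, chosen only so that nearby values are correlated).
random.seed(0)
transect = [6.0]
for _ in range(99):
    transect.append(transect[-1] + random.gauss(0.0, 0.1))

lags, gamma = experimental_semivariogram(transect, max_lag=20)
for h, g in zip(lags[:5], gamma[:5]):
    print(f"lag {h}: gamma = {g:.4f}")
```

A permissible model (for example spherical or exponential) would then be fitted to the `(lags, gamma)` pairs before kriging; that fitting step is outside the scope of this sketch.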
The use of surrogate models or metamodeling has led to new areas of research in simulation-based design optimization. Metamodeling approaches have advantages over traditional techniques when dealing with the noisy responses and/or high computational cost characteristic of many computer simulations. This paper focuses on a particular algorithm, Efficient Global Optimization (EGO), that uses kriging metamodels. Several infill sampling criteria are reviewed, namely criteria for selecting design points at which the true functions are evaluated. The infill sampling criterion has a strong influence on how efficiently and accurately EGO locates the optimum. Variance-reducing criteria substantially reduce the RMS error of the resulting metamodels, while other criteria influence how locally or globally EGO searches. Criteria that place more emphasis on global searching require more iterations to locate optima and do so less accurately than criteria emphasizing local search.
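The abstract does not list which infill sampling criteria are compared. As a hedged illustration of the idea, the sketch below implements expected improvement, the criterion commonly associated with EGO, for a minimization problem. It assumes the kriging metamodel has already supplied a predicted mean `mu` and standard deviation `sigma` at each candidate point; the function name and the two candidate points are hypothetical.

```python
import math


def expected_improvement(mu, sigma, f_best):
    """Expected improvement at one candidate point (minimization).

    mu, sigma: kriging prediction and its standard deviation at the point.
    f_best: best (lowest) true function value observed so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(f_best - mu, 0.0)
    u = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    # First term rewards a low predicted mean (local search),
    # second term rewards high uncertainty (global search).
    return (f_best - mu) * cdf + sigma * pdf


# Two hypothetical candidates as (mu, sigma) pairs: one slightly better
# than the incumbent but well known, one worse on average but uncertain.
f_best = 1.0
candidates = {"near incumbent": (0.9, 0.05), "unexplored": (1.2, 0.8)}
scores = {name: expected_improvement(m, s, f_best)
          for name, (m, s) in candidates.items()}
next_point = max(scores, key=scores.get)
print(scores, "->", next_point)
```

In this toy case the uncertain candidate scores higher, which shows how a criterion's weighting of the uncertainty term shifts the search from local toward global.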
This paper presents a methodology to conduct geostatistical variography and interpolation on areal data measured over geographical units (or blocks) with different sizes and shapes, while accounting for heterogeneous weight or kernel functions within those units. The deconvolution method is iterative and seeks the point-support model that minimizes the difference between the theoretically regularized semivariogram model and the model fitted to areal data. This model is then used in area-to-point (ATP) kriging to map the spatial distribution of the attribute of interest within each geographical unit. The coherence constraint ensures that the weighted average of kriged estimates equals the areal datum. This approach is illustrated using health data (cancer rates aggregated at the county level) and population density surface as a kernel function. Simulations are conducted over two regions with contrasting county geographies: the state of Indiana and four states in the Western United States. In both regions, the deconvolution approach yields a point-support semivariogram model that is reasonably close to the semivariogram of simulated point values. The use of this model in ATP kriging yields a more accurate prediction than a naïve point kriging of areal data that simply collapses each county into its geographic centroid. ATP kriging reduces the smoothing effect and is robust with respect to small differences in the point-support semivariogram model. Important features of the point-support semivariogram, such as the nugget effect, can never be fully validated from areal data. The user may want to narrow down the set of solutions based on his knowledge of the phenomenon (e.g., set the nugget effect to zero). The approach presented avoids the visual bias associated with the interpretation of choropleth maps and should facilitate the analysis of relationships between variables measured over different spatial supports.
During the last decade one has witnessed an increasing interest in assessing health risks caused by exposure to contaminants present in the soil, air, and water. A key component of any exposure study is a reliable model for the space-time distribution of pollutants. This paper compares the performances of multi-Gaussian and indicator kriging for modeling probabilistically the spatial distribution of arsenic concentrations in groundwater of southeast Michigan, accounting for arsenic data collected at private residential wells and the hydrogeochemistry of the area. The arsenic data set, which was provided by the Michigan Department of Environmental Quality (MDEQ), includes measurements collected between 1993 and 2002 at 8212 different wells. Factorial kriging was used to filter the short-range spatial variability in arsenic concentration, leading to a significant increase (17-65%) in the proportion of variance explained by secondary information, such as type of unconsolidated deposits and proximity to Marshall Sandstone subcrop. Cross validation of well data shows that accounting for this regional background does not improve the local prediction of arsenic, which reveals the presence of unexplained sources of variability and the importance of modeling the uncertainty attached to these predictions. Slightly more precise models of uncertainty were obtained using indicator kriging. Well data collected in 2004 were compared to the prediction model and best results were found for soft indicator kriging, which has a mean absolute error of 5.6 µg/L. Although this error is large with respect to the USEPA standard of 10 µg/L, it is smaller than the average difference (12.53 µg/L) between data collected at the same well and day, as reported in the MDEQ data set. Thus the uncertainty attached to the sampled values themselves, which arises from laboratory errors and lack of information regarding the sample origin, contributes to the poor accuracy of the geostatistical predictions in southeast Michigan.
Citation: Goovaerts, P., G. AvRuskin, J. Meliker, M. Slotnick, G. Jacquez, and J. Nriagu (2005), Geostatistical modeling of the spatial variability of arsenic in groundwater of southeast Michigan, Water Resour. Res., 41, W07013,