To design an efficient survey or monitoring program for a natural resource, it is important to consider the spatial distribution of the resource. Generally, sample designs that are spatially balanced are more efficient than designs that are not. A spatially balanced design selects a sample that is evenly distributed over the extent of the resource. In this article we present a new spatially balanced design that can be used to select a sample from discrete and continuous populations in multi-dimensional space. The design, which we call balanced acceptance sampling (BAS), utilizes the Halton sequence to ensure spatial diversity of selected locations. Targeted inclusion probabilities are achieved by acceptance sampling. The BAS design is conceptually simpler than competing spatially balanced designs, executes faster, and achieves better spatial balance as measured by a number of quantities. The algorithm has been implemented in an R package that is freely available for download.
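The idea of combining a Halton sequence with acceptance sampling can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' R implementation: the function names (`radical_inverse`, `halton_point`, `bas_sample`, `in_disc`) are hypothetical, the population is a continuous region inside the unit square with equal inclusion probabilities, and a single random skip into the sequence stands in for the randomization used by the published design.

```python
import random


def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer in a given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_point(index, bases=(2, 3)):
    """Point number `index` of the Halton sequence in the unit square."""
    return tuple(radical_inverse(index, b) for b in bases)


def bas_sample(n, in_region, seed=None, max_skip=10**6):
    """Draw n points by walking the Halton sequence from a random start
    and accepting only the points that fall inside the target region.

    Assumption: one random skip for all dimensions, equal inclusion
    probabilities. This is a simplification for illustration.
    """
    rng = random.Random(seed)
    index = rng.randrange(max_skip)  # random starting position in the sequence
    sample = []
    while len(sample) < n:
        point = halton_point(index)
        if in_region(point):  # acceptance step
            sample.append(point)
        index += 1
    return sample


def in_disc(point):
    """Hypothetical study area: a disc of radius 0.5 centred in the unit square."""
    x, y = point
    return (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25


if __name__ == "__main__":
    for x, y in bas_sample(5, in_disc, seed=42):
        print(f"{x:.4f}, {y:.4f}")
```

Because consecutive Halton points are well spread over the unit square, the accepted points tend to be well spread over the region too. Unequal inclusion probabilities could be handled by adding a further Halton dimension and accepting a point only when that extra coordinate falls below the scaled target probability, which is the role acceptance sampling plays in the abstract above.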
Decision trees are a popular technique in statistical data classification. They recursively partition the feature space into disjoint sub-regions until each sub-region becomes homogeneous with respect to a particular class. The basic Classification and Regression Tree (CART) algorithm partitions the feature space using axis-parallel splits. When the true decision boundaries are not aligned with the feature axes, this approach can produce a complicated boundary structure. Oblique decision trees use oblique decision boundaries to potentially simplify the boundary structure. The major limitation of this approach is that the tree induction algorithm is computationally expensive. In this article we present a new decision tree algorithm, called HHCART. The method utilizes a series of Householder matrices to reflect the training data at each node during the tree construction. Each reflection is based on the directions of the eigenvectors from each class's covariance matrix. Considering axis-parallel splits in the reflected training data provides an efficient way of finding oblique splits in the unreflected training data. Experimental results show that the accuracy and size of the HHCART trees are comparable with some benchmark methods in the literature. The appealing feature of HHCART is that it can handle both qualitative and quantitative features in the same oblique split.
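The reflection step can be illustrated with a small Python sketch. It assumes a unit direction `d` is already available (for example, an eigenvector of a class covariance matrix computed elsewhere) and builds the Householder matrix H = I - 2uu^T with u proportional to e1 - d, so that H maps d onto the first coordinate axis. The function names are hypothetical and this is not the HHCART implementation; it only shows why an axis-parallel split in the reflected data corresponds to an oblique split in the original data.

```python
import math


def householder_matrix(d):
    """Return H = I - 2 u u^T with u proportional to e1 - d, so that H d = e1."""
    n = len(d)
    norm_d = math.sqrt(sum(v * v for v in d))
    d = [v / norm_d for v in d]  # normalise the direction
    u = [(1.0 if i == 0 else 0.0) - d[i] for i in range(n)]
    norm_u = math.sqrt(sum(v * v for v in u))
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if norm_u < 1e-12:  # d is already aligned with the first axis
        return identity
    u = [v / norm_u for v in u]
    return [[identity[i][j] - 2.0 * u[i] * u[j] for j in range(n)]
            for i in range(n)]


def reflect(points, H):
    """Apply the reflection H to every point (each point is a list of floats)."""
    n = len(H)
    return [[sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
            for x in points]


if __name__ == "__main__":
    d = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]  # assumed eigenvector direction
    H = householder_matrix(d)
    data = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
    for original, reflected in zip(data, reflect(data, H)):
        projection = sum(a * b for a, b in zip(d, original))
        # The first reflected coordinate equals the projection of the point onto d.
        print(original, round(reflected[0], 6), round(projection, 6))
```

Since H is symmetric and its own inverse, the first coordinate of a reflected point equals the projection of the original point onto `d`. A threshold on that coordinate, which is an axis-parallel split in the reflected space, is therefore a split of the form d·x ≤ t in the original feature space.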
Some environmental studies use non-probabilistic sampling designs to draw samples from spatially distributed populations. Unfortunately, these samples can be difficult to analyse statistically and can give biased estimates of population characteristics. Spatially balanced sampling designs are probabilistic designs that spread the sampling effort evenly over the resource. These designs are particularly useful for environmental sampling because they produce good sample coverage over the resource, they have precise design-based estimators, and they can potentially reduce the sampling cost. The most popular spatially balanced design is Generalized Random Tessellation Stratified (GRTS), which has many desirable features including a spatially balanced sample, design-based estimators and the ability to select spatially balanced oversamples. This article considers the popularity of spatially balanced sampling, reviews several spatially balanced sampling designs and shows how these designs can be implemented in the statistical programming language R. We hope to increase the visibility of spatially balanced sampling and encourage environmental scientists to use these designs.
Land reclamation associated with natural gas development has become increasingly important to mitigate land surface disturbance in western North America. Since well pads occur on sites with multiple land uses and ownership types, the progress and outcomes of these efforts are of interest to multiple stakeholders including industry, practitioners and consultants, regulatory agents, private landowners, and the scientific community. Reclamation success criteria often vary within, and among, government agencies and across land ownership types. Typically, reclamation success of a well pad is judged by comparing vegetation cover from a single transect on the pad to a single transect in an adjacent reference site, and data are collected by a large number of technicians with various field monitoring skills. We used “SamplePoint” image analysis software and a spatially balanced sampling design, called balanced acceptance sampling, to demonstrate how spatially explicit quantitative data can be used to determine whether sites are meeting various reclamation success criteria, and we used chi-square tests to show how sites differ statistically in vegetation percent cover. This method collects field data faster than traditional methods. We demonstrate how quantitative and spatially explicit data can be utilized by multiple stakeholders, how they can improve upon current reference site selection, how they can satisfy reclamation monitoring requirements for multiple regulatory agencies, and how they may help improve future seed mix selection. We also discuss how they may reduce costs for operations responsible for reclamation and how they may reduce observer bias.
A random search algorithm for unconstrained local nonsmooth optimization is described. The algorithm forms a partition on R^n using classification and regression trees (CART) from statistical pattern recognition. The CART partition defines desirable subsets where the objective function f is relatively low, based on previous sampling, from which further samples are drawn directly. Alternating between partition and sampling phases provides an effective method for nonsmooth optimization. The sequence of iterates {z_k} is shown to converge to an essential local minimizer of f with probability one under mild conditions. Numerical results are presented to show that the method is effective and competitive in practice.
This article presents a modification of balanced acceptance sampling (BAS) that causes inclusion probabilities to better approximate targeted inclusion probabilities. A new sample frame constructor for BAS is also introduced from which equi-probable spatially balanced samples are drawn.