Stratified sampling is a technique that separates the elements of a population into nonoverlapping groups, called strata. This paper describes a new algorithm for the one-dimensional case, in which the stratification problem reduces to determining the strata boundaries. Assuming that the number L of strata and the total sample size n are predetermined, we obtain the strata boundaries by considering an objective function associated with the variance. To solve this problem, we implemented an algorithm based on the iterated local search metaheuristic. Computational results obtained from a real data set are presented and discussed.
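As an illustration of how strata boundaries can be scored, the sketch below evaluates one common variance-based objective: the variance of the stratified sample mean under Neyman allocation, $V = \frac{1}{n}\left(\sum_h W_h S_h\right)^2 - \frac{1}{N}\sum_h W_h S_h^2$. The abstract does not state which objective the authors use, so this choice is an assumption, and the function names `split_into_strata` and `neyman_variance` are hypothetical. The sketch also ignores rounding of the per-stratum sample sizes.

```python
from bisect import bisect_left
from statistics import stdev


def split_into_strata(values, boundaries):
    """Place each value in a stratum; stratum h holds b_{h-1} < x <= b_h.

    `boundaries` must be sorted and holds the L - 1 interior cut points.
    """
    strata = [[] for _ in range(len(boundaries) + 1)]
    for x in values:
        strata[bisect_left(boundaries, x)].append(x)
    return strata


def neyman_variance(values, boundaries, n):
    """Variance of the stratified mean under Neyman allocation (assumed objective).

    Returns infinity when a stratum has fewer than two elements, since its
    standard deviation is then undefined and the boundaries are unusable.
    """
    N = len(values)
    strata = split_into_strata(values, boundaries)
    if any(len(s) < 2 for s in strata):
        return float("inf")
    weights = [len(s) / N for s in strata]   # W_h = N_h / N
    sds = [stdev(s) for s in strata]         # S_h, with N_h - 1 denominator
    first = sum(w * s for w, s in zip(weights, sds)) ** 2 / n
    second = sum(w * s * s for w, s in zip(weights, sds)) / N
    return first - second


if __name__ == "__main__":
    population = [1, 2, 3, 4, 10, 11, 12, 13]
    # A cut between the two natural groups gives a smaller variance
    # than a cut placed inside the first group.
    print(neyman_variance(population, [4], n=4))
    print(neyman_variance(population, [2], n=4))
```

A local search over boundaries, such as the one the abstract mentions, would call an objective like this repeatedly and keep the boundaries with the lowest value.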
The problem of finding an optimal sample stratification has been extensively studied in the literature. In this paper, we propose a heuristic optimization method for the univariate optimum stratification problem that aims to minimize the sample size for a given precision level. The method is based on the variable neighborhood search metaheuristic, combined with an exact method. Numerical experiments were performed on a dataset of 24 instances, and the results of the proposed algorithm were compared with two well-known methods from the literature. Our algorithm outperformed these methods in $94\%$ of the considered cases. In addition, we developed an enumeration algorithm to find the global optimal solution for some populations and scenarios, which enabled us to validate our metaheuristic method. We find that our algorithm obtained the global optimal solution in the vast majority of these cases.
Cluster analysis comprises several methods that aim to identify groups within a data set. This article presents new heuristics based on the Iterated Local Search metaheuristic to solve the Automatic Clustering Problem, that is, the problem of determining the ideal number of groups for a data set. To this end, in one of the phases of the heuristic, the silhouette index was used, which combines the concepts of cohesion and separation and is used by the proposed heuristics to evaluate the quality of solutions. According to the computational experiments reported in this work, the new heuristic ILS-DBSCAN is very efficient in terms of processing time and very effective in terms of the quality of the solutions obtained, when compared with other methods from the literature. In general, the results of this new heuristic were superior to the results reported in the literature. ILS-DBSCAN thus appears to be a promising algorithm for solving the problem addressed.
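To make the silhouette index concrete, the following minimal sketch computes the standard mean silhouette for a labeled set of points with Euclidean distance. It is not the ILS-DBSCAN heuristic itself; the function name `silhouette_index` is hypothetical, and assigning a score of 0 to singleton clusters is a common convention assumed here.

```python
from math import dist


def silhouette_index(points, labels):
    """Mean silhouette over all points (Euclidean distance).

    For each point, a is the mean distance to the other members of its own
    cluster (cohesion) and b is the smallest mean distance to the members of
    any other cluster (separation); the point scores (b - a) / max(a, b).
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: score 0 by convention (assumption)
        # dist(p, p) is 0, so summing over the whole cluster is safe.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(silhouette_index(pts, [0, 0, 1, 1]))  # well-separated grouping
    print(silhouette_index(pts, [0, 1, 0, 1]))  # poor grouping
```

Values close to 1 indicate compact, well-separated groups, so a search over candidate groupings with different numbers of clusters can use this score to compare them.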