Key Words sample design, statistical inference, sample weights, analysis of data from complex samples
Abstract The increased use of rigorous population-sampling methods and the analysis of data from those samples in cross-sectional surveys, case-control studies, longitudinal-cohort investigations, and other epidemiological research efforts have raised important statistical issues for health analysts. We describe the origin, implications, and some plausible resolutions for several of these issues. Some of the main issues we consider include (a) establishing whom the sample represents; (b) using sample weights; (c) understanding the role of other important features, such as the use of sampling stratification and the selection of clustered groups of population members; and (d) finding ways to analyze study data with key sampling features in mind. Ultimately, resolution of all of these issues requires that analysts clearly define a reference population and then understand the role of design features in relating sample results to that population.
POPULATION SAMPLING IN EPIDEMIOLOGY
Historically, most empirical knowledge has been based on incomplete observation and therefore incomplete samplings of the human experience (46). In each such case there has been a population (i.e. some collection of persons or objects about which knowledge was sought) and a sample (i.e. a portion of the population to be observed, thus providing an informational basis for acquiring knowledge about the population).
The need for sampling in epidemiological research stems from the nature of several overlapping research designs that are commonly used in the field (17, 39, 45). For each of these designs, when it is impractical to examine the entire population, statements are made about a targeted group of individuals called the study population, based on observations obtained from a representative portion of the population.