In obtaining a regression fit to a set of data, ordinary least squares regression depends directly on the parametric model formulated by the researcher. If this model is incorrect, a least squares analysis may be misleading. Alternatively, nonparametric regression (kernel or local polynomial regression, for example) has no dependence on an underlying parametric model, but instead depends entirely on the distances between regressor coordinates and the prediction point of interest. This procedure avoids the necessity of a reliable model, but in using no information from the researcher, may fit to irregular patterns in the data. The proper combination of these two regression procedures can overcome their respective problems. Considered is the situation where the researcher has an idea of which model should explain the behavior of the data, but this model is not adequate throughout the entire range of the data. An extension of partial linear regression and two methods of model robust regression are developed and compared in this context. These methods involve parametric fits to the data and nonparametric fits to either the data or residuals. The two fits are then combined in the most efficient proportions via a mixing parameter. Performance is based on bias and variance considerations.
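The combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: the straight-line parametric model, the Gaussian-kernel (Nadaraya-Watson) smoother, and the names `ols_line`, `kernel_smooth`, `combined_fit`, `h`, and `lam` are all assumptions chosen for illustration. It shows two plausible ways to mix the fits, one smoothing the residuals and one smoothing the raw data.

```python
import math

def ols_line(x, y):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def kernel_smooth(x, y, x0, h):
    """Nadaraya-Watson estimate at x0 (Gaussian kernel, bandwidth h)."""
    w = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def combined_fit(x, y, h, lam, on_residuals=True):
    """Mix a parametric and a nonparametric fit via lam in [0, 1].

    on_residuals=True : parametric fit + lam * (kernel fit to residuals)
    on_residuals=False: (1 - lam) * parametric fit + lam * (kernel fit to data)
    Both forms are illustrative assumptions, not taken from the abstract.
    """
    b0, b1 = ols_line(x, y)
    param = [b0 + b1 * xi for xi in x]
    if on_residuals:
        resid = [yi - pi for yi, pi in zip(y, param)]
        return [pi + lam * kernel_smooth(x, resid, xi, h)
                for xi, pi in zip(x, param)]
    return [(1 - lam) * pi + lam * kernel_smooth(x, y, xi, h)
            for xi, pi in zip(x, param)]
```

With `lam = 0` both variants reduce to the least squares fit, and as `lam` grows the nonparametric component is given more weight. How `lam` and `h` should be chosen is a separate question that this sketch does not address.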
Epidemiologic studies of pesticides have been subject to important biases arising from exposure misclassification. Although turf applicators are exposed to a variety of pesticides, these exposures have not been well characterized. This paper describes a repeated measures study of 135 TruGreen applicators over three spraying seasons via the collection of 1028 urine samples. These applicators were employed in six cities across the United States. Twenty-four-hour estimates (μg) were calculated for the parent compounds 2,4-D, MCPA, mecoprop, dicamba, and imidacloprid and for the insecticide metabolites MPA and 6-CNA. Descriptive statistics were used to characterize the urinary levels of these pesticides, whereas mixed models were applied to describe the variance apportionment with respect to city, season, individual, and day of sampling. The contributions to the overall variance explained by each of these factors varied considerably by the type of pesticide. The implications for characterizing exposures in these workers within the context of a cohort study are discussed.
Our goal is to find a regression technique that can be used in a small-sample situation with possible model misspecification. The development of a new bandwidth selector allows nonparametric regression (in conjunction with least squares) to be used in this small-sample problem, where nonparametric procedures have previously proven to be inadequate. Considered here are two new semiparametric (model-robust) regression techniques that combine parametric and nonparametric techniques when there is partial information present about the underlying model. A general overview is given of how typical concerns for bandwidth selection in nonparametric regression extend to the model-robust procedures. A new penalized PRESS criterion (with a graphical selection strategy for applications) is developed that overcomes these concerns and is able to maintain the beneficial mean squared error properties of the new model-robust methods. It is shown that this new selector outperforms standard and recently improved bandwidth selectors. Comparisons of the selectors are made via numerous generated data examples and a small simulation study.
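As a rough illustration of PRESS-based bandwidth selection, the sketch below computes the ordinary (unpenalized) leave-one-out PRESS statistic for a Gaussian-kernel smoother and picks the candidate bandwidth that minimizes it. The penalty term of the penalized PRESS criterion is not given in the abstract and is not reproduced here; the function names and the grid-search strategy are assumptions made for illustration.

```python
import math

def loo_kernel_estimate(x, y, i, h):
    """Kernel estimate at x[i] computed without observation i."""
    num = 0.0
    den = 0.0
    for j, (xj, yj) in enumerate(zip(x, y)):
        if j == i:
            continue
        w = math.exp(-0.5 * ((xj - x[i]) / h) ** 2)
        num += w * yj
        den += w
    if den == 0.0:
        # Bandwidth so small that every weight underflowed to zero.
        return math.inf
    return num / den

def press(x, y, h):
    """Sum of squared leave-one-out prediction errors for bandwidth h."""
    return sum((y[i] - loo_kernel_estimate(x, y, i, h)) ** 2
               for i in range(len(x)))

def select_bandwidth(x, y, candidates):
    """Return the candidate bandwidth with the smallest PRESS."""
    return min(candidates, key=lambda h: press(x, y, h))
```

A call such as `select_bandwidth(x, y, [0.1, 0.2, 0.5, 1.0])` returns one of the supplied candidates. A penalized variant would add a term to `press` before minimizing, leaving the rest of the search unchanged.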
Creatinine measurements can be used to standardize urinary pesticide concentrations and to estimate "completeness" of urine collections. Published statistical models exist to predict 24-h creatinine, but many were developed assuming independence among observations. Using correlated repeated measurement data collected from an occupational cohort, the objectives were to create a predictive model for 24-h urinary creatinine and to compare the predictive capability of this model to earlier published models. Using a mixed-model methodology, the appropriate covariance structure was identified and utilized to model the measurements. A backwards elimination model building technique applied to the model building data set (110 adult male subjects and 457 creatinine values) yielded a final model that included variables for body mass index (BMI), height, diabetes, allergies, medical conditions that affect kidney function, use of creatine supplements, and anti-inflammatory medications. Using an external model validation data set (21 adult male subjects' creatinine values, n = 91 observations from a total of 275) the predictive performance of the model was evaluated using the mean square prediction error (MSPR) and the Pearson's correlation coefficient (r); its performance was better (MSPR = 279184, r = 0.43) than any of the earlier models investigated (MSPR: range 658860-393139; r: range 0.18-0.38). In conclusion, the use of a covariance structure that allowed repeated measurements for any one individual to be correlated improved the predictive performance. For purposes of incomplete urine sample identification in observational studies, it is necessary to collect information in addition to age, gender, and BMI, which are typically used in these settings.
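For readers unfamiliar with the two validation metrics, the following minimal sketch shows how they are commonly computed, assuming the usual definitions: MSPR as the mean of squared differences between observed and predicted values, and r as the sample Pearson correlation. The function names are hypothetical and no data from the study are reproduced.

```python
import math

def mspr(observed, predicted):
    """Mean square prediction error over a validation set."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n

def pearson_r(a, b):
    """Sample Pearson correlation coefficient between a and b."""
    n = len(a)
    abar = sum(a) / n
    bbar = sum(b) / n
    sab = sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))
    saa = sum((ai - abar) ** 2 for ai in a)
    sbb = sum((bi - bbar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)
```

A lower MSPR and a higher r on held-out observations both indicate better predictive performance, which is the sense in which the models above are compared.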