Predicting future outcomes based on knowledge obtained from past observational data is a common application in a wide variety of areas of scientific research. In the present paper, prediction will be focused on various grades of cervical preneoplasia and neoplasia. Statistical tools used for prediction should of course possess predictive accuracy, and preferably meet secondary requirements such as speed, ease of use, and interpretability of the resulting predictive model. A new automated procedure based on an extension (called 'boosting') of regression and classification tree (CART) models is described. The resulting tool is a fast 'off-the-shelf' procedure for classification and regression that is competitive in accuracy with more customized approaches, while being fairly automatic to use (little tuning), and highly robust especially when applied to less than clean data. Additional tools are presented for interpreting and visualizing the results of such multiple additive regression tree (MART) models.
The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal variables and that it can handle and discover nonlinear relationships between variables. Also, nonlinear PCA can deal with variables at their appropriate measurement level; for example, it can treat Likert-type scales ordinally instead of numerically. Every observed value of a variable can be referred to as a category. While performing PCA, nonlinear PCA converts every category to a numeric value, in accordance with the variable's analysis level, using optimal quantification. The authors discuss how optimal quantification is carried out, what analysis levels are, which decisions have to be made when applying nonlinear PCA, and how the results can be interpreted. The strengths and limitations of the method are discussed. An example applying nonlinear PCA to empirical data using the program CATPCA (J. J. Meulman, W. J. Heiser, & SPSS, 2004) is provided.
In a meta-analysis of 37 studies, the effects of psychoeducational (health education and stress management) programs for coronary heart disease patients were examined. The results suggest that these programs yielded a 34% reduction in cardiac mortality; a 29% reduction in recurrence of myocardial infarction (MI); and significant (p < .025) positive effects on blood pressure, cholesterol, body weight, smoking behavior, physical exercise, and eating habits. No effects of psychoeducational programs were found in regard to coronary bypass surgery, anxiety, or depression. The results also suggest that cardiac rehabilitation programs that were successful on proximal targets (systolic blood pressure, smoking behavior, physical exercise, emotional distress) were more effective on distal targets (cardiac mortality and MI recurrences) than programs without success on proximal targets.
Summary.A new procedure is proposed for clustering attribute value data. When used in conjunction with conventional distance-based clustering algorithms this procedure encourages those algorithms to detect automatically subgroups of objects that preferentially cluster on subsets of the attribute variables rather than on all of them simultaneously. The relevant attribute subsets for each individual cluster can be different and partially (or completely) overlap with those of other clusters. Enhancements for increasing sensitivity for detecting especially low cardinality groups clustering on a small subset of variables are discussed. Applications in different domains, including gene expression arrays, are presented.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.