The two most commonly used penalized model selection criteria, the Bayesian information criterion (BIC) and Akaike’s information criterion (AIC), are examined and compared. Their motivations as approximations of two different target quantities are discussed, and their performance in estimating those quantities is assessed. Despite their different foundations, some similarities between the two statistics can be observed, for example, in analogous interpretations of their penalty terms. The behavior of the criteria in selecting good models for observed data is examined with simulated data and also illustrated with the analysis of two well-known data sets on social mobility. It is argued that useful information for model selection can be obtained from using AIC and BIC together, particularly from trying as far as possible to find models favored by both criteria.
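For readers who want the two criteria in concrete form, the following is a minimal sketch using the standard textbook definitions, AIC = -2 log L + 2k and BIC = -2 log L + k log n. The log-likelihoods, parameter counts and sample size are hypothetical values chosen for illustration only; they are not taken from the paper or its social mobility data sets.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's information criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k log n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical fitted models: name -> (maximized log-likelihood, number of parameters).
candidates = {"A": (-520.0, 3), "B": (-512.0, 6)}
n_obs = 500  # hypothetical sample size

aic_scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
bic_scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}

# Smaller values are preferred under both criteria.
best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)

print("AIC:", aic_scores, "preferred:", best_by_aic)
print("BIC:", bic_scores, "preferred:", best_by_bic)
```

With these made-up inputs the two criteria disagree: the heavier BIC penalty (log n per parameter rather than 2) favors the smaller model, while AIC favors the larger one. Disagreements of this kind are where looking at both criteria together is most informative.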
The question of whether and how ethnic diversity affects the social cohesion of communities has become an increasingly prominent and contested topic of academic and political debate. In this paper we focus on a single city: London. As possibly the most ethnically diverse conurbation on the planet, London serves as a particularly suitable test-bed for theories about the effects of ethnic heterogeneity on prosocial attitudes. We find neighbourhood ethnic diversity in London to be positively related to the perceived social cohesion of neighbourhood residents, once the level of economic deprivation is accounted for. Ethnic segregation within neighbourhoods, on the other hand, is associated with lower levels of perceived social cohesion. Both effects are strongly moderated by the age of individual residents: diversity has a positive effect on social cohesion for young people but this effect dissipates in older age groups; the reverse pattern is found for ethnic segregation.
Social mobility is now a matter of greater political concern in Britain than at any time previously. However, the data available for the determination of mobility trends are less adequate today than two or three decades ago. It is widely believed in political and in media circles that social mobility is in decline. But the evidence so far available from sociological research, focused on intergenerational class mobility, is not supportive of this view. We present results based on a newly constructed dataset covering four birth cohorts that provides improved data for the study of trends in class mobility and that also allows analyses to move from the twentieth into the twenty-first century. These results confirm that there has been no decline in mobility, whether considered in absolute or relative terms. In the case of women, there is in fact evidence of mobility increasing. However, the better quality and extended range of our data enable us to identify 'mobility problems' other than the supposed decline. Among the members of successive cohorts, the experience of absolute upward mobility is becoming less common and that of absolute downward mobility more common; and class-linked inequalities in relative chances of mobility and immobility appear wider than previously thought.
We consider models which combine latent class measurement models for categorical latent variables with structural regression models for the relationships between the latent classes and observed explanatory and response variables. We propose a two-step method of estimating such models. In its first step, the measurement model is estimated alone, and in the second step the parameters of this measurement model are held fixed when the structural model is estimated. Simulation studies and applied examples suggest that the two-step method is an attractive alternative to existing one-step and three-step methods. We derive estimated standard errors for the two-step estimates of the structural model which account for the uncertainty from both steps of the estimation, and show how the method can be implemented in existing software for latent variable modelling.
Two correction methods are considered for multiple logistic regression models with some covariates measured with error. Both methods are based on approximating the complicated regression model between the response and the observed covariates with simpler models. The first model is the logistic approximation proposed by Rosner et al., and the second is a second-order extension of this model. Only the mean and covariance matrix of the true values of the covariates given the observed values have to be specified; no distributional assumptions about the measurement error are made. The parameters related to the conditional moments are estimated from a separate validation data set. The correction methods considered here are compared to other methods proposed in the literature. They are also applied to a multiple logistic model describing the effect of nutrient intakes on the ratio of serum HDL cholesterol. The data constitute baseline data from an epidemiological cohort study, in which a separate pilot study has been carried out to obtain validation information. In the example the corrected parameter estimates from the two approximate models are very similar. Both differ considerably from the naive logistic estimates, indicating a large effect of the measurement error. The various assumptions required by the correction methods are also discussed.
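The general idea behind a first-order correction of this kind can be sketched for the simplest case. The example below assumes a single error-prone covariate and a linear conditional mean E[X | W] = a + λW estimated from validation data, in which case the naive logistic coefficient is approximately λ times the true one and can be corrected by dividing by λ (often called regression calibration). This is a simplified illustration and not the authors' multivariate or second-order method; the function names, the validation values and the naive coefficient are all hypothetical.

```python
def ols_slope(w, x):
    """Least-squares slope from regressing true values x on observed values w."""
    n = len(w)
    mean_w = sum(w) / n
    mean_x = sum(x) / n
    s_xw = sum((wi - mean_w) * (xi - mean_x) for wi, xi in zip(w, x))
    s_ww = sum((wi - mean_w) ** 2 for wi in w)
    return s_xw / s_ww


def corrected_coefficient(beta_naive, validation_w, validation_x):
    """First-order correction: divide the naive coefficient by the estimated slope."""
    lam = ols_slope(validation_w, validation_x)
    return beta_naive / lam


# Hypothetical validation data: observed (error-prone) and true covariate values.
validation_w = [1.0, 2.0, 3.0, 4.0, 5.0]
validation_x = [1.5, 2.0, 2.5, 3.0, 3.5]

# Hypothetical naive log-odds coefficient from the main study.
beta_naive = 0.4

lam_hat = ols_slope(validation_w, validation_x)
beta_corrected = corrected_coefficient(beta_naive, validation_w, validation_x)
print("estimated slope:", lam_hat)
print("corrected coefficient:", beta_corrected)
```

In this toy input the estimated slope is below one, so the corrected coefficient is larger in magnitude than the naive one, which is the usual direction of the correction when measurement error attenuates the estimated effect. A full analysis would also need to carry the uncertainty in the estimated slope through to the standard error of the corrected coefficient.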
It is widely believed that regression models for binary responses are problematic if we want to compare estimated coefficients from models for different groups or with different explanatory variables. This concern has two forms. The first arises if the binary model is treated as an estimate of a model for an unobserved continuous response, and the second when models are compared between groups which have different distributions of other causes of the binary response. We argue that these concerns are usually misplaced. The first of them is only relevant if the unobserved continuous response is really the subject of substantive interest. If it is, the problem should be addressed through better measurement of this response. The second concern refers to a situation which is unavoidable but unproblematic, in that causal effects and descriptive associations are inherently group-dependent and can be compared as long as they are correctly estimated.