The association between species distribution and environmental variables is often complex, with nonlinear and interacting effects. Bayesian hierarchical models can quantify linear and higher-order effects of the variables, as well as their interaction effects. Their strength is that they account for uncertainty in observations, models, and parameters. However, the model selection process is usually time-consuming, especially when the number of environmental variables is large. Random forest, an efficient machine learning algorithm, can rank the environmental variables and thereby facilitate the model selection process. We analyzed the nest site selection of the crested ibis (Nipponia nippon) at the watershed level across its distribution range. The crested ibis has attracted much attention in the past 30 years due to its extremely low population level, and it has now recovered to over 1,000 individuals in the wild. We built Bayesian hierarchical models to quantify the association between the number of nests in 95 watersheds and nine environmental variables of these watersheds. We applied random forest to check the effect of every variable and removed the unimportant variables from the hierarchical models. In contrast to our previous studies, we found that the interaction between the area of rice paddy and the area of water bodies (i.e., rivers, lakes, and ponds) contributed most to nest site selection, whereas the linear terms of rice paddy and water body each had little effect. The detection probability of the nests during the surveys was inversely associated with elevation and with the standard deviation of elevation (i.e., roughness of the landscape) in the watershed. Our models suggest that the crested ibis needs both rice paddies and water bodies in its annual life cycle. Habitat protection practices should cover not only rice paddies but also water bodies to ensure the long-term survival of this endangered bird.
When the number of environmental variables is large, the model selection process of
Bayesian hierarchical models can be very slow (Beguin et al. 2012).
Random forest is an ensemble machine learning method for classification and regression that operates by constructing a multitude of decision trees (Breiman 2001a).
Decision trees tend to learn highly irregular patterns, i.e. they overfit their training
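The ensemble idea described above, and its use for ranking variables, can be illustrated with a minimal from-scratch sketch. This is not the implementation used in the study, which the text does not specify. The variable names (`rice_paddy`, `water_body`, `noise`), the synthetic response, and all tuning values are hypothetical. For brevity, importance is measured by permuting each variable on the training data, whereas Breiman's procedure uses out-of-bag samples.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, depth, max_depth, n_try, rng):
    """Grow one regression tree; each split tries a random subset of features."""
    if depth >= max_depth or len(y) < 4:
        return mean(y)  # leaf: predict the mean response
    best = None
    for j in rng.sample(range(len(X[0])), n_try):
        # Thresholds start at the second smallest value, so both sides are non-empty.
        for thr in sorted({row[j] for row in X})[1:]:
            left = [t for row, t in zip(X, y) if row[j] < thr]
            right = [t for row, t in zip(X, y) if row[j] >= thr]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, thr)
    if best is None:  # every candidate feature is constant in this node
        return mean(y)
    _, j, thr = best
    lo = [i for i, row in enumerate(X) if row[j] < thr]
    hi = [i for i, row in enumerate(X) if row[j] >= thr]
    return (j, thr,
            build_tree([X[i] for i in lo], [y[i] for i in lo], depth + 1, max_depth, n_try, rng),
            build_tree([X[i] for i in hi], [y[i] for i in hi], depth + 1, max_depth, n_try, rng))


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        j, thr, left, right = tree
        tree = left if row[j] < thr else right
    return tree


def fit_forest(X, y, n_trees=50, max_depth=4, n_try=2, seed=0):
    """Fit each tree to a bootstrap resample of the data (bagging)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 0, max_depth, n_try, rng))
    return forest


def mse(forest, X, y):
    """Mean squared error of the averaged tree predictions."""
    preds = [mean(predict_tree(t, row) for t in forest) for row in X]
    return mean((p - t) ** 2 for p, t in zip(preds, y))


def permutation_importance(forest, X, y, names, seed=1):
    """Increase in MSE after shuffling one variable at a time."""
    rng = random.Random(seed)
    base = mse(forest, X, y)
    result = {}
    for j, name in enumerate(names):
        col = [row[j] for row in X]
        rng.shuffle(col)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        result[name] = mse(forest, X_perm, y) - base
    return result


# Hypothetical data: 95 units, response driven only by an interaction term.
data_rng = random.Random(42)
names = ["rice_paddy", "water_body", "noise"]
X = [[data_rng.random() for _ in names] for _ in range(95)]
y = [10 * r[0] * r[1] + data_rng.gauss(0, 0.3) for r in X]

forest = fit_forest(X, y)
importance = permutation_importance(forest, X, y, names)
ranking = sorted(names, key=importance.get, reverse=True)
print(ranking)
```

Averaging many trees grown on bootstrap resamples reduces the variance of any single overfitted tree. In this sketch, variables with an importance near zero would be the candidates to drop before fitting a slower hierarchical model.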
In this study, we used hierarchical models (composed of several GLMs) to study the habitat use of the crested ibis (Nipponia nippon) at a regional scale during both the breeding season and the post-breeding season. Such analysis at a large spatial scale enables us to examine the overall habitat preference across its life cycle.
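As an illustration of how several GLMs can be composed into one hierarchical model, the following is a generic sketch and not necessarily the structure used in this study. The symbols are assumptions introduced here: $N_i$ is the latent true count in spatial unit $i$, $y_i$ the observed count, $x_{1i}$ and $x_{2i}$ two habitat covariates, and $z_i$ a covariate that affects detection.

```latex
\begin{align}
  N_i &\sim \mathrm{Poisson}(\lambda_i), &
  \log \lambda_i &= \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i}, \\
  y_i \mid N_i &\sim \mathrm{Binomial}(N_i, p_i), &
  \operatorname{logit}(p_i) &= \alpha_0 + \alpha_1 z_i .
\end{align}
```

The first line is a log-linear GLM for the ecological state, including an interaction term with coefficient $\beta_3$. The second is a logistic GLM for the observation process. Placing priors on the $\alpha$ and $\beta$ coefficients would make the model Bayesian.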
The crested ibis was listed as critically endangered (BirdLife International 2001). The crested ibis is an ideal species for studying the integrative habitat preference
METHODS
Study area
Recovered from a few remnant individuals, the wild populations of crested ibis Table 1.
The crested ibis occurrence layer consists of 941 records, which are the nest ...