Other components of variability related to the spatial alignment between spectra and ground references include errors in plot coordinates, upscaling or downscaling, and distortions caused by departing from nadir, among others (Manolakis et al., 2003). The discretisation of continuous domains such as spectra, space or time results in the loss of a certain amount of information (Bruce et al., 2002). The spatial resolution of remote sensing data (pixel) or the sample

The possible solutions for selecting covariates for modelling with hyperspectral data while avoiding multicollinearity include: (1) extracting spectral indices that explain, causally or empirically, the relationship with the target plant trait based on a priori knowledge; (2) searching for a coefficient, derived from a combination of two or more bands, that is most highly correlated with the plant trait (Darvishzadeh et al., 2008); (3) combining wavelengths to create latent

Predictive models for plant traits are mostly selected from the data rather than derived from theory, and are often chosen among different regression techniques (James et al., 2013). If a model is assessed with the same data used to fit it, more complexity directly means more apparent accuracy, because the prediction error on the fitting data always decreases as complexity increases (James et al., 2013). Consequently, it is improper to assess and report the accuracy of predictive models using the same data that were used to select the final model. Predictive models require the data to be split into training and testing (sub)sets to assess accuracy (Esbensen and Geladi, 2010). There are many alternatives, from splitting an independent

This thesis comprises six chapters, of which four research chapters have been submitted as scientific articles to peer-reviewed ISI journals, and three are currently accepted. The general outline is indicated below.

Chapter 1: the introductory chapter discusses the importance of plant traits and the role of remote sensing in monitoring and understanding the underlying processes.
The chapter is designed to highlight issues that need further improvement when modelling plant traits with hyperspectral data.

Chapter 2: demonstrates that empirical models using hyperspectral data to predict traits are very likely to lead to significant overfitting, even when selected by commonly used robust cross-validation. A new method named Naïve Overfitting Index Selection (NOIS) was developed to quantify overfitting while selecting model complexity (tuning). The method was tested using five hyperspectral datasets and seven machine learning regression techniques.

Chapter 3: shows that machine learning regressions using hyperspectral data are likely to lead to inaccurate predictions when significant autocorrelation is