“…The number of factors selected can strongly influence the model results, so care was taken to find a method for selecting the most appropriate number for our models. Methods tested include the root mean squared error of cross validation (RMSECV; Naes et al., 2002), selection of a local minimum within the RMSECV surface (Li et al., 2002), modified RMSECV (van der Voet, 1994), root mean squared error of calibration (RMSEC), root mean squared error of prediction (RMSEP), a custom bias/error inclusion metric (Takahama and Dillner, 2015), explained variance in the test set of standard functional group moles, and simulated annealing (SA; Ledesma et al., 2012). Ultimately, the number of PLS factors was selected using the minimum RMSECV, so the number of factors was allowed to vary between model versions.…”
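
The minimum-RMSECV rule described in the passage can be illustrated with a minimal sketch. This is not the authors' code: the NIPALS-style single-response PLS fit, the 5-fold split, the synthetic data, and the names `pls1_fit`, `rmsecv_curve`, and `select_n_factors` are all assumptions made for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_factors):
    """Fit a single-response PLS model (NIPALS); return (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean          # centered working residuals
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xr.T @ yr                        # weight: direction of max covariance
        w /= np.linalg.norm(w)
        t = Xr @ w                           # scores for this factor
        tt = t @ t
        p = Xr.T @ t / tt                    # X loadings
        c = yr @ t / tt                      # y loading
        Xr = Xr - np.outer(t, p)             # deflate X and y
        yr = yr - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in X space
    return coef, x_mean, y_mean


def rmsecv_curve(X, y, max_factors, n_folds=5, seed=0):
    """RMSECV for 1..max_factors factors, using one fixed random fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    curve = []
    for k in range(1, max_factors + 1):
        sq_err = 0.0
        for held_out in folds:
            train = np.setdiff1d(np.arange(len(y)), held_out)
            coef, x_mean, y_mean = pls1_fit(X[train], y[train], k)
            pred = (X[held_out] - x_mean) @ coef + y_mean
            sq_err += np.sum((y[held_out] - pred) ** 2)
        curve.append(np.sqrt(sq_err / len(y)))
    return np.array(curve)


def select_n_factors(curve):
    """Number of factors (1-based) at the global minimum of the RMSECV curve."""
    return int(np.argmin(curve)) + 1


# Synthetic stand-in for spectra: 60 samples, 40 channels, 3 latent factors.
rng = np.random.default_rng(42)
scores = rng.normal(size=(60, 3))
X = scores @ rng.normal(size=(3, 40)) + 0.05 * rng.normal(size=(60, 40))
y = scores @ np.array([1.0, -0.5, 0.25]) + 0.05 * rng.normal(size=60)

curve = rmsecv_curve(X, y, max_factors=8)
best = select_n_factors(curve)
print("RMSECV by number of factors:", np.round(curve, 4))
print("Selected number of factors:", best)
```

Because the selection is rerun for each model, the chosen number of factors can differ from one model version to the next, which matches the behavior the passage describes. The global minimum is only one of the criteria listed; a local-minimum or modified-RMSECV rule would replace `select_n_factors` while leaving the cross-validation loop unchanged.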