The use of species distribution models (SDMs) is limited by their performance in terms of accuracy, precision, or the spatial distribution of model errors. Despite the wide acceptance of some standard statistics used to evaluate SDMs, there is currently a strong ongoing debate as to their use. The “area under the curve” (AUC) is a popular measure used to evaluate SDMs; however, it does not provide complete information about model accuracy. The maximum True Skill Statistic (TSS) is another statistic that is gaining acceptance. However, evaluations of a model’s accuracy based solely on this statistic may also be misleading. We investigate alternative methods for evaluating the performance of SDMs in order to compare different modelling approaches objectively. We evaluate the performance of SDMs fitted to simulated and real data by comparing model predictions with additional validation datasets. We propose visualising TSS scores over the whole detection threshold range (the TSS profile). We show that models with similarly good performance according to AUC can produce very different results and may serve different purposes. Also, a high maximum TSS may not guarantee accurate predictions and should be reported together with the threshold at which the maximum is reached (t*). We observe that the higher t* is, the better predicted observations correlate with confirmed observations. In addition, SDM predictions should be accompanied by the corresponding uncertainty map to avoid misleading conclusions. Uncertainty that is too high or too widely spread on such maps would call the overall accuracy of the model into question. Whether the model is intended to detect all potential observation sites (a sensitive model) or to accurately predict where confirmed observations could be found (a specific model) sets different performance targets for the model. The proposed approach helps to discern which SDM best suits the intended goals.
Furthermore, the TSS profile helps i) to evaluate the overall performance of SDMs and compare among them, ii) to identify the main source of error, and iii) to select a detection threshold depending on the map’s intended use.
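
As an illustration only, the TSS profile could be computed along the following lines. This is a minimal sketch, not the implementation used in the study: it assumes the standard definition TSS = sensitivity + specificity − 1, assumes that a site is classified as a predicted presence when its predicted probability is greater than or equal to the threshold, and uses hypothetical function names (`tss_profile`, `max_tss`) and an evenly spaced threshold grid.

```python
import numpy as np


def tss_profile(y_true, y_prob, thresholds=None):
    """Return thresholds, sensitivity, specificity and TSS at each threshold.

    y_true: observed presences (1) and absences (0) in the validation data.
    y_prob: predicted probabilities of presence from the SDM.
    Assumption: a site counts as a predicted presence when y_prob >= threshold.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_prob = np.asarray(y_prob, dtype=float)
    if thresholds is None:
        # Assumed grid: 101 evenly spaced thresholds over [0, 1].
        thresholds = np.linspace(0.0, 1.0, 101)
    thresholds = np.asarray(thresholds, dtype=float)

    n_presence = y_true.sum()
    n_absence = (~y_true).sum()
    if n_presence == 0 or n_absence == 0:
        raise ValueError("Validation data must contain presences and absences.")

    # Rows are thresholds, columns are sites.
    predicted = y_prob[None, :] >= thresholds[:, None]

    # Sensitivity: fraction of observed presences predicted as presences.
    sensitivity = (predicted & y_true).sum(axis=1) / n_presence
    # Specificity: fraction of observed absences predicted as absences.
    specificity = (~predicted & ~y_true).sum(axis=1) / n_absence

    tss = sensitivity + specificity - 1.0
    return thresholds, sensitivity, specificity, tss


def max_tss(thresholds, tss):
    """Return the maximum TSS and the threshold t* where it is first reached."""
    i = int(np.argmax(tss))
    return float(tss[i]), float(thresholds[i])


# Toy validation data, invented purely for illustration.
observed = [0, 0, 1, 1]
predicted_prob = [0.1, 0.4, 0.6, 0.9]
t, sens, spec, tss = tss_profile(observed, predicted_prob, thresholds=[0.0, 0.5, 1.0])
best_tss, t_star = max_tss(t, tss)
```

Plotting `tss` against the thresholds gives the profile. Inspecting the sensitivity and specificity curves alongside it is one way to see whether missed presences or false presences dominate the error at a given threshold.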