“…Models trained with larger presence datasets tend to perform better because they sample species' environmental tolerances more completely and are less affected by sampling bias (Araújo & Guisan, 2006; Stockwell & Peterson, 2002). Occurrence data are rapidly being digitized and disseminated online (Reginato & Michelangeli, 2020; Zurell, Franklin, et al., 2020; Petersen et al., 2021), but records are available for only a small percentage of existing specimens, which may themselves represent a fraction of global plant diversity (Marsico et al., 2020). Additionally, collections are often limited to plants that are easy to acquire because they grow in accessible locations (Elith & Leathwick, 2009; Kadmon et al., 2004), whereas factors such as the difficulty of handling and preserving cactus specimens (Baker et al., 1985; Fosberg, 1932) or narrow endemism and rarity (Ferrier & Guisan, 2006; Papeş & Gaubert, 2007) may limit collecting efforts.…”