Plant identification is critical for a wide range of biological fields and goals, from understanding ecological processes such as community assembly to conserving rare and threatened species (Thessen, 2016). Historically, species have been identified using trait-based approaches in the form of dichotomous and polyclave keys (Tilling, 1984; Edwards et al., 1987). These identification keys remain an important and widely used resource for scientists (Gaylard and Kerley, 1995; Randler, 2008), as they are convenient, inexpensive, and enable identification when tissue samples cannot be collected for molecular barcoding (Will and Rubinoff, 2004). Improving trait-based plant identification (e.g., reducing the number of traits required for identification) could be especially useful for increasing the efficacy of citizen scientists in large-scale projects where the use of genetic tools is not feasible or cost-effective (Gallo and Waitt, 2011; Roy et al., 2016). Advancements in computational methods such as machine learning, in tandem with the recent rise of easily accessible online "big data," could provide an unprecedented opportunity to improve trait-based identification, just as they have proved useful in other important ecological areas. For instance, machine learning has been applied to large databases to predict phenomena such as global surface temperatures (Casaioli et al., 2003), and underpins some of the most