Titouan Lorieul scite author profile

Toward a large‐scale and deep phenological stage annotation of herbarium specimens: Case studies from temperate, tropical, and equatorial floras

Premise of the Study: Phenological annotation models computed on large‐scale herbarium data sets were developed and tested in this study. Methods: Herbarium specimens represent a significant resource with which to study plant phenology. Nevertheless, phenological annotation of herbarium specimens is time‐consuming, requires substantial human investment, and is difficult to mobilize at large taxonomic scales. We created and evaluated new methods based on deep learning techniques to automate annotation of phenological stages and tested these methods on four herbarium data sets representing temperate, tropical, and equatorial American floras. Results: Deep learning allowed correct detection of fertile material with an accuracy of 96.3%. Accuracy was slightly decreased for finer‐scale information (84.3% for flower and 80.5% for fruit detection). Discussion: The method described has the potential to allow fine‐grained phenological annotation of herbarium specimens at large ecological scales. Deeper investigation regarding the taxonomic scalability of this approach is needed.

Machine Learning Using Digitized Herbarium Specimens to Advance Phenological Research

Pearson, Nelson, Aronson, et al. 2020

Machine learning (ML) has great potential to drive scientific discovery by harvesting data from images of herbarium specimens (preserved plant material curated in natural history collections), but ML techniques have only recently been applied to this rich resource. ML has particularly strong prospects for the study of plant phenological events such as growth and reproduction. As a major indicator of climate change, driver of ecological processes, and critical determinant of plant fitness, plant phenology is an important frontier for the application of ML techniques for science and society. In the present article, we describe a generalized, modular ML workflow for extracting phenological data from images of herbarium specimens, and we discuss the advantages, limitations, and potential future improvements of this workflow. Strategic research and investment in specimen-based ML methods, along with the aggregation of herbarium specimen data, may give rise to a better understanding of life on Earth.

Multi-Label Learning from Single Positive Labels

Cole, Aodha, Lorieul, et al. 2021

Predicting all applicable labels for a given image is known as multi-label classification. Compared to the standard multi-class case (where each image has only one label), it is considerably more challenging to annotate training data for multi-label classification. When the number of potential labels is large, human annotators find it difficult to mention all applicable labels for each training image. Furthermore, in some settings detection is intrinsically difficult e.g. finding small object instances in high resolution images. As a result, multi-label training data is often plagued by false negatives. We consider the hardest version of this problem, where annotators provide only one relevant label for each image. As a result, training sets will have only one positive label per image and no confirmed negatives. We explore this special case of learning from missing labels across four different multi-label image classification datasets for both linear classifiers and end-to-end finetuned deep networks. We extend existing multi-label losses to this setting and propose novel variants that constrain the number of expected positive labels during training. Surprisingly, we show that in some cases it is possible to approach the performance of fully labeled classifiers despite training with significantly fewer confirmed labels.
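The setting described above can be illustrated with a small sketch. It is not the paper's exact loss: it shows a per-image binary cross-entropy in which the single annotated label is the only confirmed positive and every unobserved label is treated as a negative, plus an optional penalty that pulls the sum of predicted probabilities towards an assumed number of positives. The function name and the `expected_positives` and `weight` parameters are hypothetical, chosen only for this illustration.

```python
import math


def single_positive_loss(probs, positive_index, expected_positives=None, weight=1.0):
    """Illustrative loss for one image with a single confirmed positive label.

    probs: predicted per-label probabilities in (0, 1).
    positive_index: index of the one annotated positive label.
    expected_positives: assumed (hypothetical) number of labels per image;
        when given, a squared penalty constrains the sum of probabilities.
    """
    eps = 1e-12
    total = 0.0
    for i, p in enumerate(probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if i == positive_index:
            total += -math.log(p)
        else:
            # Unobserved label, treated as negative (it may be a false negative).
            total += -math.log(1.0 - p)
    loss = total / len(probs)
    if expected_positives is not None:
        # Penalize deviation of the expected positive count from the assumed value.
        loss += weight * (sum(probs) - expected_positives) ** 2
    return loss
```

Treating unobserved labels as negatives is the simplest baseline for this setting; the penalty term is one possible way to express a constraint on the number of expected positives.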

Overview of LifeCLEF 2020: A System-Oriented Evaluation of Automated Species Identification and Species Distribution Prediction

Joly, Goëau, Kahl, et al. 2020

Building accurate knowledge of the identity, the geographic distribution and the evolution of species is essential for the sustainable development of humanity, as well as for biodiversity conservation. However, the difficulty of identifying plants and animals in the field is hindering the aggregation of new data and knowledge. Identifying and naming living plants or animals is almost impossible for the general public and is often difficult even for professionals and naturalists. Bridging this gap is a key step towards enabling effective biodiversity monitoring systems. The LifeCLEF campaign, presented in this paper, has been promoting and evaluating advances in this domain since 2011. The 2020 edition proposes four data-oriented challenges related to the identification and prediction of biodiversity: (i) PlantCLEF: cross-domain plant identification based on herbarium sheets, (ii) BirdCLEF: bird species recognition in audio soundscapes, (iii) GeoLifeCLEF: location-based prediction of species based on environmental and occurrence data, and (iv) SnakeCLEF: snake identification based on image and geographic location.

Overview of LifeCLEF 2021: An Evaluation of Machine-Learning Based Species Identification and Species Distribution Prediction

Joly, Goëau, Kahl, et al. 2021

Building accurate knowledge of the identity, the geographic distribution and the evolution of species is essential for the sustainable development of humanity, as well as for biodiversity conservation. However, the difficulty of identifying plants and animals is hindering the aggregation of new data and knowledge. Identifying and naming living plants or animals is almost impossible for the general public and is often difficult even for professionals and naturalists. Bridging this gap is a key step towards enabling effective biodiversity monitoring systems. The LifeCLEF campaign, presented in this paper, has been promoting and evaluating advances in this domain since 2011. The 2021 edition proposes four data-oriented challenges related to the identification and prediction of biodiversity: (i) PlantCLEF: cross-domain plant identification based on herbarium sheets, (ii) BirdCLEF: bird species recognition in audio soundscapes, (iii) GeoLifeCLEF: remote sensing based prediction of species, and (iv) SnakeCLEF: Automatic Snake Species Identification with Country-Level Focus.

LifeCLEF Lab Overview. Accurately identifying organisms observed in the wild is an essential step in ecological studies. Unfortunately, observing and identifying living organisms requires high levels of expertise. For instance, plants alone account for more than

Categorizing plant images at the variety level: Did you say fine-grained?

Champ, Lorieul, Bonnet, et al. 2016. Pattern Recognition Letters

This paper addresses the problem of categorizing plant images at the variety level, i.e. at a finer taxonomic grain than state-of-the-art studies, which usually work at the species level. It therefore introduces two new evaluation datasets of agro-biodiversity interest, each related to concrete scenarios on large-scale plant resources. They have been chosen so as to involve very different acquisition protocols and visual patterns, in order to evaluate whether state-of-the-art image classification techniques can generalize to such specific contexts and avoid the cost of building ad hoc solutions. The first one is a collection of 2071 pictures of loose rice seeds built from 95 accessions kept in a seed bank. The second one is a collection of 2037 pictures of grape leaves taken in the field and belonging to 34 varieties among those most commonly used in viticulture. Both datasets exhibit very low inter-class variability, resulting in two challenging fine-grained classification tasks, even for expert human operators. A baseline experimental study was conducted on the two datasets using the two most effective families of classification techniques in the state of the art, i.e. convolutional neural networks on one side and Fisher vector-based discriminant models on the other. It shows that the achieved classification performance differs greatly between the two problems. It is poor for the grape leaves collection but much better for the rice seeds collection, for which the acquisition protocol was much more constrained and the morphological variability more visible. The conclusion is that automatically identifying plant varieties might already be feasible for some specific scenarios and in controlled environments, but that it is still an open problem in the general case.

Set-valued classification -- overview via a unified framework

Chzhen, Denis, Hebiri, et al. 2021. Preprint

The multi-class classification problem is among the most popular and well-studied statistical frameworks. Modern multi-class datasets can be extremely ambiguous, and single-output predictions fail to deliver satisfactory performance. By allowing predictors to predict a set of label candidates, set-valued classification offers a natural way to deal with this ambiguity. Several formulations of set-valued classification are available in the literature, and each of them leads to different prediction strategies. The present survey aims to review popular formulations using a unified statistical framework. The proposed framework encompasses previously considered formulations, leads to new ones, and clarifies the underlying trade-offs of each. We provide infinite sample optimal set-valued classification strategies and review a general plug-in principle to construct data-driven algorithms. The exposition is supported by examples and pointers to both theoretical and practical contributions. Finally, we provide experiments on real-world datasets comparing these approaches in practice and providing general practical guidelines.
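As a rough illustration of the plug-in idea mentioned above, the sketch below takes estimated class probabilities for one input and builds a label set from them in two common ways: a fixed-size set (the k most probable labels) and a threshold-based set (every label whose probability reaches a cutoff). The abstract does not say which formulations the survey covers, so these two rules, the function names, and the example species labels are assumptions made for illustration only.

```python
def top_k_set(probs, k):
    """Return the k labels with the highest estimated probability."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    return set(ranked[:k])


def threshold_set(probs, threshold):
    """Return every label whose estimated probability reaches the threshold."""
    return {label for label, p in probs.items() if p >= threshold}


# Hypothetical estimated class probabilities for a single ambiguous image.
example_probs = {"oak": 0.5, "beech": 0.3, "elm": 0.15, "ash": 0.05}
fixed_size = top_k_set(example_probs, 2)
adaptive = threshold_set(example_probs, 0.1)
```

The fixed-size rule always returns the same number of candidates, while the threshold rule returns larger sets for more ambiguous inputs; this is one example of the kind of trade-off between formulations.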

Can Artificial Intelligence Help in the Study of Vegetative Growth Patterns from Herbarium Collections? An Evaluation of the Tropical Flora of the French Guiana Forest

Goëau, Lorieul, Heuret, et al. 2022. Plants

A better knowledge of tree vegetative growth phenology and its relationship to environmental variables is crucial to understanding forest growth dynamics and how climate change may affect them. Less studied than reproductive structures, vegetative growth phenology focuses primarily on the analysis of growing shoots, from buds to leaf fall. In temperate regions, low winter temperatures halt the growth of vegetative shoots and lead to a well-known annual growth cycle pattern for most species. The humid tropics, on the other hand, have less seasonality and contain many more tree species, leading to a diversity of patterns that is still poorly known and understood. The work in this study aims to advance knowledge in this area, focusing specifically on herbarium scans, as herbariums offer the promise of tracking phenology over long periods of time. However, such a study requires a large number of shoots to be able to draw statistically relevant conclusions. We propose to investigate the extent to which the use of deep learning can help detect and classify by type these relatively rare vegetative structures in herbarium collections. Our results demonstrate the relevance of using herbarium data in vegetative phenology research as well as the potential of deep learning approaches for growing shoot detection.

Toward a large‐scale and deep phenological stage annotation of herbarium specimens: Case studies from temperate, tropical, and equatorial floras
