The rapid increase in both the quantity
and complexity of data
generated daily in the field of environmental science
and engineering (ESE) demands corresponding advances in data analytics.
Advanced data analysis approaches, such as machine learning (ML),
have become indispensable tools for revealing hidden patterns or deducing
correlations for which conventional analytical methods face limitations
or challenges. However, ML concepts and practices have not been widely
utilized by researchers in ESE. This feature explores the potential
of ML to revolutionize data analysis and modeling in the ESE field,
and covers the essential knowledge needed for such applications. First,
we use five examples to illustrate how ML addresses complex ESE problems.
We then summarize four major types of applications of ML in ESE: making
predictions; extracting feature importance; detecting anomalies; and
discovering new materials or chemicals. Next, we introduce the essential
knowledge required and current shortcomings in ML applications in
ESE, with a focus on three important but often overlooked components
when applying ML: correct model development, proper model interpretation,
and sound applicability analysis. Finally, we discuss challenges and
future opportunities in the application of ML tools in ESE to highlight
the potential of ML in this field.
Background
Hyperspectral reflectance data in the visible, near infrared and shortwave infrared range (VIS–NIR–SWIR, 400–2500 nm) are commonly used to nondestructively measure plant leaf properties. We investigated the usefulness of VIS–NIR–SWIR as a high-throughput tool to measure six leaf properties of maize plants: chlorophyll content (CHL), leaf water content (LWC), specific leaf area (SLA), nitrogen (N), phosphorus (P), and potassium (K). This assessment was performed using the lines of the maize diversity panel. Data were collected from plants grown under greenhouse conditions, as well as in the field under two nitrogen application regimes. Leaf-level hyperspectral data were collected with a VIS–NIR–SWIR spectroradiometer at tasseling. Two multivariate modeling approaches, partial least squares regression (PLSR) and support vector regression (SVR), were employed to estimate the leaf properties from hyperspectral data. Several common vegetation indices (VIs: GNDVI, RENDVI, and NDWI), which were calculated from hyperspectral data, were also assessed for estimating these leaf properties.
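The three vegetation indices named above all share the normalized-difference form (a - b) / (a + b). The following is a minimal sketch of how such indices might be computed from a leaf reflectance spectrum. The band centers (550, 705, 750, 800, 860, and 1240 nm) are commonly used choices assumed here for illustration; the abstract does not state which wavelengths the study used, and the helper names are hypothetical.

```python
def normalized_difference(a, b):
    """Generic normalized-difference form: (a - b) / (a + b)."""
    return (a - b) / (a + b)


def nearest_band(spectrum, target_nm):
    """Return reflectance at the sampled wavelength closest to target_nm.

    `spectrum` maps wavelength in nm to reflectance (0 to 1).
    """
    closest = min(spectrum, key=lambda wavelength: abs(wavelength - target_nm))
    return spectrum[closest]


def vegetation_indices(spectrum):
    """Compute GNDVI, RENDVI, and NDWI from one leaf spectrum.

    Band centers are ASSUMED common choices, not values from the study.
    """
    green = nearest_band(spectrum, 550)
    red_edge_705 = nearest_band(spectrum, 705)
    red_edge_750 = nearest_band(spectrum, 750)
    nir_800 = nearest_band(spectrum, 800)
    nir_860 = nearest_band(spectrum, 860)
    swir_1240 = nearest_band(spectrum, 1240)
    return {
        "GNDVI": normalized_difference(nir_800, green),
        "RENDVI": normalized_difference(red_edge_750, red_edge_705),
        "NDWI": normalized_difference(nir_860, swir_1240),
    }


if __name__ == "__main__":
    # Toy spectrum with made-up reflectance values, for illustration only.
    toy_spectrum = {550: 0.10, 705: 0.20, 750: 0.40,
                    800: 0.50, 860: 0.50, 1240: 0.40}
    for name, value in vegetation_indices(toy_spectrum).items():
        print(f"{name}: {value:.3f}")
```

Each index reduces a spectrum of roughly two thousand bands to a single number built from two bands, which is one plausible reason full-spectrum models such as PLSR and SVR can capture properties that the indices miss.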
Results
Some VIs were able to estimate CHL and N (R² > 0.68), but failed to estimate the other four leaf properties. Models developed with PLSR and SVR exhibited comparable performance to each other, and provided improved accuracy relative to VI models. CHL was estimated most successfully, with R² (coefficient of determination) > 0.94 and ratio of performance to deviation (RPD) > 4.0. N was also predicted satisfactorily (R² > 0.85 and RPD > 2.6). LWC, SLA and K were predicted moderately well, with R² ranging from 0.54 to 0.70 and RPD from 1.5 to 1.8. The lowest prediction accuracy was for P, with R² < 0.5 and RPD < 1.4.
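The two accuracy metrics reported above can be sketched as follows. This assumes the common definitions: the coefficient of determination as 1 - SS_res / SS_tot, and RPD as the standard deviation of the reference values divided by the root mean square error of prediction. The abstract does not specify the exact variants used (for example, sample versus population standard deviation), so the sample form is an assumption, as are the function names.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between reference and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of reference values / RMSE.

    Uses the sample standard deviation (n - 1); this choice is an assumption.
    """
    n = len(observed)
    mean_observed = sum(observed) / n
    variance = sum((o - mean_observed) ** 2 for o in observed) / (n - 1)
    return math.sqrt(variance) / rmse(observed, predicted)


if __name__ == "__main__":
    # Made-up reference and predicted values, for illustration only.
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]
    estimate = [1.1, 1.9, 3.2, 3.8, 5.0]
    print(f"R2  = {r_squared(reference, estimate):.3f}")
    print(f"RPD = {rpd(reference, estimate):.2f}")
```

Because RPD scales the prediction error by the spread of the reference data, it allows accuracy to be compared across traits measured in different units, which is how CHL, N, and P are ranked against each other above.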
Conclusion
This study showed that VIS–NIR–SWIR reflectance spectroscopy is a promising tool for low-cost, nondestructive, and high-throughput analysis of a number of leaf physiological and biochemical properties. Full-spectrum modeling approaches (PLSR and SVR) led to more accurate prediction models than VI-based methods. We call for the construction of a leaf VIS–NIR–SWIR spectral library that would greatly benefit the plant phenotyping community in research on plant leaf traits.
Background
Leaf chlorophyll content plays an important role in indicating plant stresses and nutrient status. Traditional approaches for the quantification of chlorophyll content mainly include acetone ethanol extraction, spectrophotometry and high-performance liquid chromatography. Such destructive methods based on laboratory procedures are time consuming, expensive, and not suitable for high-throughput analysis. High-throughput imaging techniques are now widely used for non-destructive analysis of plant phenotypic traits. In this study, three imaging modules (RGB, hyperspectral, and fluorescence imaging) were used, separately and in combination, to estimate chlorophyll content of sorghum plants in a greenhouse environment. Color features, spectral indices, and chlorophyll fluorescence intensity were extracted from these three types of images, and multiple linear regression models and partial least squares regression (PLSR) models were built to predict leaf chlorophyll content (measured by a handheld leaf chlorophyll meter) from the image features.
Results
The models with a single color feature from RGB images predicted chlorophyll content with R² ranging from 0.67 to 0.88. The models using the three spectral indices extracted from hyperspectral images (Ratio Vegetation Index, Normalized Difference Vegetation Index, and Modified Chlorophyll Absorption Ratio Index) predicted chlorophyll content with R² ranging from 0.77 to 0.78. The model using the fluorescence intensity extracted from fluorescence images predicted chlorophyll content with R² of 0.79. The PLSR model that involved all the image features extracted from the three different imaging modules exhibited the best performance for predicting chlorophyll content, with R² of 0.90. It was also found that including SLW (Specific Leaf Weight) in the image-based models further improved the chlorophyll prediction accuracy.
Conclusion
Each of the three imaging modules (RGB, hyperspectral, and fluorescence) tested in our study could, on its own, estimate chlorophyll content of sorghum plants reasonably well. Fusing image features from different imaging modules with PLSR modeling significantly improved the predictive performance. Image-based phenotyping could provide a rapid and non-destructive approach for estimating chlorophyll content in sorghum.
Key message
The lack of efficient phenotyping capacity has been recognized as a bottleneck in forestry phenotyping and breeding. Modern phenotyping technologies use systems equipped with various imaging sensors to automatically collect high-volume phenotypic data that can be used to assess various attributes of trees.
Context
Efficient phenotyping has the potential to spark a new Green Revolution, and it would provide an opportunity to acquire growth parameters and dissect the genetic bases of quantitative traits. Phenotyping platforms aim to link information from several sources to derive knowledge about trees' attributes.
Aims
Various tree phenotyping techniques were reviewed and analyzed along with their different applications.
Methods
This article presents the definition and characteristics of forest tree phenotyping and reviews newly developed imaging-based practices in forest tree phenotyping.
Results
This review addressed a wide range of forest tree phenotyping applications, including surveys of actual inter- and intra-specific variability, evaluation of genotype and species responses to biotic and abiotic stresses, and phenological measurements.
Conclusion
With the support of advanced phenotyping platforms, the efficiency of trait phenotyping in forest tree breeding programs is improved.