2020
DOI: 10.1002/lom3.10406

Predicting dissolved organic carbon concentration in a dynamic salt marsh creek via machine learning

Abstract: Dissolved organic carbon (DOC) is a master variable in aquatic systems. Resolving DOC dynamics requires high‐temporal resolution data. However, DOC concentration cannot be directly measured in situ, and discrete sample collection and analysis becomes expensive as temporal resolution increases. To surmount this problem, an option is to predict site‐specific DOC concentration with linear modeling and optical data predictors collected from high‐cost, high‐maintenance in situ spectrophotometers. This study sought …

Cited by 7 publications (5 citation statements)
References 39 publications
“…This can improve the model by capturing additional samples, particularly those with lower and high S_R values, in a new training dataset. Furthermore, using a site and temporal variable, such as a point‐in‐year metric (Codden et al 2020), can help the model distinguish site‐specific and temporally‐specific signatures. Following the implementation of these new variables, model performance can further be evaluated with T² and SPE to find out if they are likely to improve the model application across sites and periods.…”
Section: Discussion
mentioning
confidence: 99%
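The statement above names a point-in-year metric but does not define it, so the following is only a minimal sketch of one plausible form: the fraction of the calendar year elapsed at a timestamp, plus a sine/cosine encoding so that late December and early January sit close together as model inputs. The function names `point_in_year` and `cyclic_features` are hypothetical and are not taken from Codden et al.

```python
import math
from datetime import datetime
from typing import Tuple


def point_in_year(ts: datetime) -> float:
    """Fraction of the calendar year elapsed at ts, in [0, 1).

    Assumed definition for illustration; leap years are handled by
    measuring against the actual length of the year containing ts.
    """
    start = datetime(ts.year, 1, 1)
    end = datetime(ts.year + 1, 1, 1)
    return (ts - start) / (end - start)


def cyclic_features(ts: datetime) -> Tuple[float, float]:
    """Map the point-in-year onto the unit circle as (sin, cos).

    This avoids the artificial jump from ~1.0 back to 0.0 at New Year
    that a raw fraction would present to a regression model.
    """
    angle = 2.0 * math.pi * point_in_year(ts)
    return math.sin(angle), math.cos(angle)
```

Either the raw fraction or the (sin, cos) pair could be appended to the optical predictors as extra columns; which works better for a given site is an empirical question.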
“…Partial least squares (PLS) regression is commonly used in the field of chemometrics (Wold et al 2001) and, in recent years, has been successfully used to train models to monitor in situ DOC concentration from UV–Vis absorption of CDOM (Langergraber et al 2003; Avagyan et al 2014; Etheridge et al 2014; Vaughan et al 2017; Codden et al 2020; Zhu et al 2020; Table 1). The PLS method allows for the whole spectrum of UV–Vis absorption to be used to best predict DOC concentration without problems due to multicollinearity while providing consistency checks for new data used to inform the model's performance.…”
Section: Regression Methods Wavelengths (Nm) Catchment Type (Size Km2...
mentioning
confidence: 99%
“…Alizadeh et al (2018) showed that ML models could accurately forecast estuarine water quality constituents, including salinity, up to 2 h in the future (Alizadeh et al 2018). Other studies have demonstrated how ML models can reduce cost in data collection and computation (Codden et al 2021). The flexibility and computational efficiency of such models raises exciting possibilities about their capacity to harness ever‐growing data availability to develop accurate models at speeds and resolutions not possible with traditional modeling approaches.…”
mentioning
confidence: 99%
“…The Random Forest algorithm is becoming increasingly popular as a simple-to-use machine learning method that allows for estimation of predictor importance (Breiman 2001). Recent examples of Random Forest implementations in aquatic biogeochemistry, including developing high-frequency proxies for parameters with limited temporal resolution (Castrillo and García 2020;Codden et al 2021;Green et al 2021), and linking in situ and remotely sensed parameters (Gu et al 2020), show great promise for understanding complex coastal biogeochemical processes. However, applying Random Forest to time series requires careful consideration of temporal dependence and other model parameterization decisions (Regier et al 2022).…”
mentioning
confidence: 99%
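The last statement notes that temporal dependence needs care when a model such as Random Forest is applied to time series. One common precaution, shown here as a sketch and not as the procedure used in any of the cited studies, is to evaluate on blocks that come strictly after the training data in time, optionally leaving a gap between them. The function name `forward_chaining_splits` and its parameters are hypothetical; the samples are assumed to be already sorted by time.

```python
from typing import Iterator, List, Tuple


def forward_chaining_splits(
    n_samples: int, n_folds: int, gap: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for time-ordered samples.

    The series is cut into n_folds + 1 contiguous blocks. Fold k tests on
    block k and trains on everything before it, minus `gap` samples that
    are dropped to reduce leakage from autocorrelation.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least n_folds + 1 samples")
    block = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        test_start = k * block
        # The final fold absorbs any leftover samples at the end.
        test_end = n_samples if k == n_folds else test_start + block
        train_end = max(0, test_start - gap)
        if train_end == 0:
            continue  # gap consumed all training data for this fold
        yield list(range(train_end)), list(range(test_start, test_end))
```

Each index pair can then be used to fit and score any regressor; unlike a shuffled split, no test sample ever precedes a training sample.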