2024
DOI: 10.3389/fbioe.2024.1228846

Generative data augmentation and automated optimization of convolutional neural networks for process monitoring

Robin Schiemer, Matthias Rüdt, Jürgen Hubbuch

Abstract: Chemometric modeling for spectral data is considered a key technology in biopharmaceutical processing to realize real-time process control and release testing. Machine learning (ML) models have been shown to increase the accuracy of various spectral regression and classification tasks, remove challenging preprocessing steps for spectral data, and promise to improve the transferability of models when compared to commonly applied, linear methods. The training and optimization of ML models require large data sets…

Citations: cited by 1 publication (2 citation statements)
References: 83 publications (163 reference statements)
“…In general, diversification strategies were reported to improve model robustness and interpretability (Santos et al, 2018) or enlarge the experimental data set (Wang et al, 2023). Alternative strategies may include synthetic enlargement of the available data sets by data augmentation (Schiemer et al, 2024) or the collection of data from multiple products, formulation components and sensors (Wei et al, 2022) for the diversification of experimental data sets.…”
Section: Effects Of Data Diversification (citation type: mentioning)
confidence: 99%
“…To effectively reduce the noise in the model, a larger data set preferably from fed-batch experiments would be required. Eventually, due to the nonlinear relationship of the Raman spectra and the VLP concentration, non-linear regression models should be evaluated such as kernel-based methods (Thissen et al, 2004; Barman et al, 2010; Zavala-Ortiz et al, 2020; Schiemer et al, 2023) or neural networks (Cui and Fearn, 2018; Wang et al, 2023; Schiemer et al, 2024).…”
Section: Effects Of Preprocessing Pipeline On Model Performance (citation type: mentioning)
confidence: 99%