2022
DOI: 10.1289/ehp10479

Principal Component Pursuit for Pattern Identification in Environmental Mixtures

Abstract: Background: Environmental health researchers often aim to identify sources or behaviors that give rise to potentially harmful environmental exposures. Objective: We adapted principal component pursuit (PCP)—a robust and well-established technique for dimensionality reduction in computer vision and signal processing—to identify patterns in environmental mixtures. PCP decomposes the exposure mixture into a low-rank matrix containing consistent patterns of exposure across …
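To make the decomposition described in the abstract concrete, here is a minimal sketch of classic convex PCP, which splits a data matrix D into a low-rank part L plus a sparse part S. It is not the authors' implementation and does not cover the limit-of-detection or nonconvex square-root variants mentioned in the citation statements. The function names (`pcp`, `soft_threshold`, `singular_value_threshold`) and the parameters `lam`, `rho`, `n_iter`, and `tol` are assumptions chosen for illustration. The default `lam = 1/sqrt(max(n, p))` is the universal value commonly quoted for standard PCP, and the default `rho` is a widely used step-size heuristic.

```python
import numpy as np


def soft_threshold(X, tau):
    """Elementwise shrinkage toward zero by tau (prox of tau * L1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def singular_value_threshold(X, tau):
    """Shrink singular values by tau (prox of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def pcp(D, lam=None, rho=None, n_iter=1000, tol=1e-7):
    """Illustrative convex PCP: minimize ||L||_* + lam * ||S||_1 s.t. L + S = D.

    Solved with a plain augmented-Lagrangian (ADMM-style) loop.
    All names and defaults here are assumptions, not the paper's settings.
    """
    D = np.asarray(D, dtype=float)
    n, p = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n, p))          # universal value for standard PCP
    if rho is None:
        rho = 0.25 * n * p / np.abs(D).sum()    # common step-size heuristic
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    Y = np.zeros_like(D)                        # dual variable for L + S = D
    for _ in range(n_iter):
        L = singular_value_threshold(D - S + Y / rho, 1.0 / rho)
        S = soft_threshold(D - L + Y / rho, lam / rho)
        R = D - L - S                           # constraint residual
        Y = Y + rho * R
        if np.linalg.norm(R) <= tol * np.linalg.norm(D):
            break
    return L, S
```

In an exposure setting, rows of D would be observations and columns would be chemicals or pollutant species. L then holds the consistent patterns, and S holds unusual, isolated exposure events. Tuning `lam` trades off how much signal is assigned to each part.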

Cited by 12 publications (20 citation statements)
References 40 publications
“…Finally, we found that when applying nonconvex √PCP to speciated PM2.5 data, it was necessary to tune hyperparameters, which is a time-intensive process that comes with a certain degree of researcher subjectivity. Other formulations of PCP have used theoretically-optimal single universal values for hyperparameters λ and μ, 17,18 but we found that these approaches were not flexible enough to detect the underlying patterns present in our dataset, as they require a better-defined low-rank structure.…”
Section: Discussion
mentioning
confidence: 99%
“…We used square-root PCP (√PCP), an extension of PCP, 18 and combined it with a separate extension introducing a nonconvex penalty on the low-rank matrix. 17 We used cross-validation to select the optimal rank of the low-rank matrix, which can be understood as the number of underlying patterns in the PM2.5 data. Please see the Supplement for further details on hyperparameter selection.…”
Section: Methods
mentioning
confidence: 99%
“…In simulations, PCP-LOD generally outperformed PCA (e.g., PCP-LOD recovered a higher percentage of the true number of patterns) when the percentage of observations below the LOD was and in scenarios in which there was either low Gaussian noise or there were both low Gaussian noise and sparse events. 10 Further, PCP-LOD largely outperformed PCA when 16 chemicals were included in the mixture, but performance decreased when the number of chemicals in the mixture increased to 48. In the application to NHANES data, PCP-LOD produced results similar to those of PCA when applied to a mixture of 21 chemicals (including dioxins, furans, and polychlorinated biphenyls) with above the LOD.…”
mentioning
confidence: 94%
“…In their new study, 10 Gibson et al. adapted principal component pursuit (PCP), a robust method for dimensionality reduction and pattern identification, to accommodate missing data and values below the LOD.…”
mentioning
confidence: 99%