Methods for apportioning sources of ambient particulate matter (PM) using the positive matrix factorization (PMF) algorithm are reviewed. Numerous procedural decisions must be made and algorithmic parameters selected when analyzing PM data with PMF. However, few publications document enough of these details for readers to evaluate, reproduce, or compare results across studies. For example, few studies document why some species were included in the modeling and others excluded, how the number of factors was selected, or how much uncertainty exists in the solutions. More thorough documentation will aid the development of standard protocols for analyzing PM data with PMF and will reveal more clearly where research is needed to help future analysts select from the various procedures and parameters available in PMF. For example, research is likely needed to determine optimal approaches for handling data below detection limits, ways to apportion PM mass among sources identified by PMF, and ways to estimate uncertainties in the solution. The review closes with recommendations for documenting the methodological details of future PMF analyses.
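For orientation, the bilinear model that PMF fits is commonly written as below. The notation is the conventional one and is an assumption here, since the abstract itself gives no formulas: x_ij is the measured concentration of species j in sample i, g_ik the contribution of factor k to sample i, f_kj the amount of species j in the profile of factor k, e_ij the residual, s_ij the user-supplied uncertainty, and p the number of factors chosen by the analyst.

```latex
x_{ij} = \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij},
\qquad
Q = \sum_{i=1}^{n} \sum_{j=1}^{m} \left( \frac{e_{ij}}{s_{ij}} \right)^{2},
\qquad
g_{ik} \ge 0,\; f_{kj} \ge 0 .
```

The procedural choices the review lists enter this formulation directly: which species are kept determines the columns j, the number of factors sets p, and the treatment of values below detection limits changes both x_ij and s_ij.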
The EPA PMF (Environmental Protection Agency positive matrix factorization) version 5.0 and the underlying multilinear engine executable, ME-2, contain three methods for estimating uncertainty in factor analytic models: classical bootstrap (BS), displacement of factor elements (DISP), and bootstrap enhanced by displacement of factor elements (BS-DISP). The goal of these methods is to capture the uncertainty of PMF analyses due to random errors and rotational ambiguity. It is shown that the three methods complement each other: depending on characteristics of the data set, one method may provide better results than the other two. Results are presented using synthetic data sets, including interpretation of diagnostics, and recommendations are given for parameters to report when documenting uncertainty estimates from EPA PMF or ME-2 applications.
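The classical bootstrap (BS) idea can be illustrated with a minimal sketch. This is not the EPA PMF or ME-2 implementation. It rests on several labeled assumptions: the solver is a simple weighted multiplicative-update non-negative factorization swapped in for illustration, samples are resampled individually with replacement, and each bootstrap factor is mapped to the base factor whose profile it correlates with most strongly. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def weighted_nmf(X, S, k, n_iter=300, seed=0):
    """Approximately minimise Q = sum(((X - G @ F) / S) ** 2) with G, F >= 0.

    Illustrative multiplicative-update solver (assumption: NOT the ME-2 algorithm).
    X must be non-negative. Profiles (rows of F) are normalised to sum to 1.
    """
    rng = np.random.default_rng(seed)
    W = 1.0 / S**2                       # weights from the stated uncertainties
    G = rng.random((X.shape[0], k)) + 0.1
    F = rng.random((k, X.shape[1])) + 0.1
    eps = 1e-12                          # guards against division by zero
    for _ in range(n_iter):
        G *= ((W * X) @ F.T) / ((W * (G @ F)) @ F.T + eps)
        F *= (G.T @ (W * X)) / (G.T @ (W * (G @ F)) + eps)
    scale = F.sum(axis=1, keepdims=True)  # shape (k, 1)
    return G * scale.T, F / scale         # G @ F is unchanged by the rescaling


def bootstrap_profiles(X, S, k, n_boot=20, seed=0):
    """Resample rows, refit, map bootstrap factors to base factors.

    Returns the 5th and 95th percentile profiles per base factor and the
    number of bootstrap factors mapped to each base factor.
    """
    rng = np.random.default_rng(seed)
    _, F_base = weighted_nmf(X, S, k, seed=seed)
    mapped = [[] for _ in range(k)]
    counts = np.zeros(k, dtype=int)
    for b in range(n_boot):
        rows = rng.integers(0, X.shape[0], size=X.shape[0])  # with replacement
        _, F_bs = weighted_nmf(X[rows], S[rows], k, seed=seed + b + 1)
        # Cross-correlation block: base profiles (rows) vs bootstrap profiles (cols).
        corr = np.corrcoef(F_base, F_bs)[:k, k:]
        for j in range(k):
            i = int(np.argmax(corr[:, j]))   # best-matching base factor
            mapped[i].append(F_bs[j])
            counts[i] += 1
    lo = np.full(F_base.shape, np.nan)
    hi = np.full(F_base.shape, np.nan)
    for i in range(k):
        if mapped[i]:                        # leave NaN if nothing mapped here
            lo[i], hi[i] = np.percentile(np.array(mapped[i]), [5, 95], axis=0)
    return lo, hi, counts


if __name__ == "__main__":
    # Hypothetical synthetic data: 2 sources, 6 species, 120 samples.
    rng = np.random.default_rng(42)
    G_true = rng.gamma(2.0, 1.0, size=(120, 2))
    F_true = np.array([[0.50, 0.30, 0.10, 0.05, 0.03, 0.02],
                       [0.02, 0.03, 0.05, 0.10, 0.30, 0.50]])
    S = np.full((120, 6), 0.02)
    X = np.clip(G_true @ F_true + rng.normal(0.0, 0.02, size=(120, 6)), 1e-6, None)
    lo, hi, counts = bootstrap_profiles(X, S, k=2, n_boot=10)
    print("bootstrap factors mapped to each base factor:", counts)
    print("5th percentile profiles:\n", lo.round(3))
    print("95th percentile profiles:\n", hi.round(3))
```

In this sketch, a base factor that attracts far fewer than `n_boot` mapped bootstrap factors is a sign that the factor is not reproduced reliably under resampling. Rotational ambiguity, which DISP and BS-DISP target, is not addressed by resampling alone.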
The new version of EPA's positive matrix factorization (EPA PMF) software, 5.0, includes three error estimation (EE) methods for analyzing factor analytic solutions: classical bootstrap (BS), displacement of factor elements (DISP), and bootstrap enhanced by displacement (BS-DISP). These methods capture the uncertainty of PMF analyses due to random errors and rotational ambiguity. To demonstrate the utility of the EE methods, results are presented for three data sets: (1) speciated PM2.5 data from a chemical speciation network (CSN) site in Sacramento, California (2003-2009); (2) trace metal, ammonia, and other species in water quality samples taken at an inline storage system (ISS) in Milwaukee, Wisconsin (2006); and (3) an organic aerosol data set from high-resolution aerosol mass spectrometer (HR-AMS) measurements in Las Vegas, Nevada (January 2008). We present an interpretation of EE diagnostics for these data sets, results from sensitivity tests of EE diagnostics using additional and fewer factors, and recommendations for reporting PMF results. BS-DISP and BS are found useful in understanding the uncertainty of factor profiles; they also indicate whether the data are over-fitted by specifying too many factors. DISP diagnostics were consistently robust, supporting the use of DISP for understanding rotational uncertainty and as a first step in assessing a solution's viability. The uncertainty of each factor's identifying species is shown to be a useful gauge for evaluating multiple solutions, e.g., solutions with different numbers of factors.
This work analyzes PM2.5 24-h average concentrations measured every third day at over 300 locations in the eastern United States during 2000. The non-negative factor analytic model, Positive Matrix Factorization, has been enhanced by modeling the dependence of PM2.5 concentrations on temperature, humidity, pressure, ozone concentrations, and wind velocity vectors. The model comprises 12 general factors, augmented by 5 urban-only factors intended to represent excess concentration present in urban locations only. The computed factor components or concentration fields are displayed as concentration maps, one for each factor, showing how much each factor contributes to the average concentration at each location. The factors are also displayed as flux maps that illustrate the spatial movement of PM2.5 aerosol, thus enabling one to pinpoint potential source areas of PM2.5. The quality of the results was investigated by examining how well the model reproduces especially high concentrations of PM2.5 on specific days at specific locations. Delimiting the spatial extent of all such factors that exhibit a clear regional maximum surrounded by an almost-zero outer domain lowered the uncertainty in the computed results.