edarf: Exploratory Data Analysis using Random Forests

Jones, Zachary M.; Linder, Fridolin

doi:10.21105/joss.00092

Cited by 79 publications

(83 citation statements)

References 4 publications

(4 reference statements)

“…Partial dependence plots allow RF models to be evaluated and to confirm how the explanatory variables are being used in the models for prediction (Jones and Linder, 2015). For the application presented here, there are general physical and chemical processes which should be confirmed in the RF models.…”

Section: Explaining the Observed Trends

mentioning

confidence: 91%
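
The partial dependence idea mentioned in the statement above can be illustrated with a minimal sketch. It is not the edarf implementation or the citing authors' code: `partial_dependence`, `toy_model`, and the data rows are hypothetical names and values introduced only for illustration, and `toy_model` is a hand-written stand-in for a fitted random forest's prediction function. For each grid value, the chosen feature is overwritten in every row and the model's predictions are averaged.

```python
from statistics import mean
from typing import Callable, List, Sequence


def partial_dependence(
    predict: Callable[[Sequence[float]], float],
    rows: Sequence[Sequence[float]],
    feature: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction when column `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the feature of interest in every row, keep the rest as observed.
        modified = [list(r[:feature]) + [value] + list(r[feature + 1:]) for r in rows]
        curve.append(mean(predict(m) for m in modified))
    return curve


def toy_model(row: Sequence[float]) -> float:
    """Hypothetical stand-in for a fitted model (assumption, not from the source)."""
    wind_speed, temperature = row
    return 30.0 - 2.0 * wind_speed + 0.5 * temperature


# Made-up (wind_speed, temperature) observations, for illustration only.
rows = [(1.0, 5.0), (3.0, 10.0), (5.0, 15.0)]
pd_curve = partial_dependence(toy_model, rows, feature=0, grid=[0.0, 2.0, 4.0])
```

Plotting `pd_curve` against the grid would give the partial dependence plot for the first feature. In practice `predict` would wrap a trained forest rather than a fixed formula.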

“…This allows RF to produce predictive models which generalise well and predictive performance is generally considered among the best of any ML technique (Caruana and Niculescu-Mizil, 2006). RF also has the advantage of not being a "black-box" method (Jones and Linder, 2015). Decision trees are one of the few ML techniques where the learning process can be explained, investigated, and interpreted. In the case of artificial neural networks or kernel based learning methods, this is much more difficult to do (Kotsiantis, 2013; Tong et al, 2003).…”

Section: Decision Trees and Random Forest

mentioning

confidence: 99%

Random forest meteorological normalisation models for Swiss PM10 trend analysis

et al. 2018

Abstract. Meteorological normalisation is a technique which accounts for changes in meteorology over time in an air quality time series. Controlling for such changes helps support robust trend analysis because there is more certainty that the observed trends are due to changes in emissions or chemistry, not changes in meteorology. Predictive random forest models (RF; a decision tree machine learning technique) were grown for 31 air quality monitoring sites in Switzerland using surface meteorological, synoptic scale, boundary layer height, and time variables to explain daily PM10 concentrations. The RF models were used to calculate meteorologically normalised trends which were formally tested and evaluated using the Theil-Sen estimator. Between 1997 and 2016, significantly decreasing normalised PM10 trends ranged between −0.09 and −1.16 µg m⁻³ yr⁻¹ with urban traffic sites experiencing the greatest mean decrease in PM10 concentrations at −0.77 µg m⁻³ yr⁻¹. Similar magnitudes have been reported for normalised PM10 trends for earlier time periods in Switzerland which indicates PM10 concentrations are continuing to decrease at similar rates as in the past. The ability for RF models to be interpreted was leveraged using partial dependence plots to explain the observed trends and relevant physical and chemical processes influencing PM10 concentrations. Notably, two regimes were suggested by the models which cause elevated PM10 concentrations in Switzerland: one related to poor dispersion conditions and a second resulting from high rates of secondary PM generation in deep, photochemically active boundary layers. The RF meteorological normalisation process was found to be robust, user friendly and simple to implement, and readily interpretable which suggests the technique could be useful in many air quality exploratory data analysis situations.
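
The Theil-Sen estimator named in the abstract takes the median of the slopes between all pairs of points, which makes the trend estimate robust to outliers. The sketch below is a minimal illustration under stated assumptions, not the authors' code: `theil_sen_slope` is a hypothetical helper name, and the yearly values are made up rather than taken from the Swiss data.

```python
from itertools import combinations
from statistics import median
from typing import Sequence


def theil_sen_slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Median of the slopes between every pair of observations."""
    slopes = [
        (y2 - y1) / (t2 - t1)
        for (t1, y1), (t2, y2) in combinations(zip(times, values), 2)
        if t2 != t1  # skip pairs with identical times to avoid division by zero
    ]
    if not slopes:
        raise ValueError("need at least two observations at distinct times")
    return median(slopes)


# Made-up yearly means (illustration only); the last value is a deliberate outlier.
years = [2010, 2011, 2012, 2013, 2014]
concentrations = [24.0, 23.0, 22.5, 21.0, 30.0]
slope = theil_sen_slope(years, concentrations)
```

With these values the estimated slope stays negative even though the final point is far above the others, which is the robustness property that motivates using the estimator for trend testing.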

“…Here, this advantage will be leveraged to help explain some of the features in the PM10 trends in Switzerland between 1997 and 2016. Partial dependence plots allow RF models to be evaluated and to confirm how the explanatory variables are being used in the models for prediction (Jones and Linder, 2015). For the application presented here, there are general physical and chemical processes which should be confirmed in the RF models.…”

Section: Explaining the Observed Trends

mentioning

confidence: 91%

“…This allows RF to produce predictive models which generalise well and predictive performance is generally considered among the best of any ML technique (Caruana and Niculescu-Mizil, 2006). RF also has the advantage of not being a "black-box" method (Jones and Linder, 2015). Decision trees are one of the few ML techniques where the learning process can be explained, investigated, and interpreted.…”

Section: Machine Learning

mentioning

confidence: 99%

Random forest meteorological normalisation models for Swiss PM10 trend analysis

Grange, Carslaw, Lewis et al. 2018

Preprint

Abstract. Meteorological normalisation is a technique which accounts for changes in meteorology over time in an air quality time series. Controlling for such changes helps support robust trend analysis because there is more certainty that the observed trends are due to changes in emissions or chemistry, not changes in meteorology. Predictive random forest models (RF; a decision tree machine learning technique) were grown for 31 air quality monitoring sites in Switzerland using surface meteorological, synoptic scale, boundary layer height, and time variables to explain daily PM10 concentrations. The RF models were used to calculate meteorologically normalised trends which were formally tested and evaluated using the Theil-Sen estimator. Between [...] continuing to decrease at similar rates as in the past. The ability for RF models to be interpreted was leveraged using partial dependence plots to explain the observed trends and relevant physical and chemical processes influencing PM10 concentrations. Notably, two regimes were suggested by the models which cause elevated PM10 concentrations in Switzerland: one related to poor dispersion conditions and a second resulting from high rates of secondary PM generation in deep, photochemically active boundary layers. The RF meteorological normalisation process was found to be robust, user friendly and simple to implement, and readily interpretable which suggests the technique could be useful in many air quality exploratory data analysis situations.

“…For discussion on these important differences in conceptualisation see (Straus 2007; Finkel and Straus 2012) inants of state-sponsored atrocities. Our approach is similar to Hegre and Sambanis (2006) seminal analysis on the causes of civil war onset, but we provide additional tests to verify whether complex interactions and nonlinearities are driving the statistical results (Bell 2015; Jones and Linder 2015; Jones and Lupu 2018; Muchlinski et al 2015). In conducting this analysis, we address three debates in the mass violence literature:…”

mentioning

confidence: 99%

What Drives State-Sponsored Violence?: Evidence from Extreme Bounds Analysis and Ensemble Learning Models

Freire, Uzonyi 2018

Preprint
