2019
DOI: 10.1177/2378023118818720

A Data-Driven Approach to the Fragile Families Challenge: Prediction through Principal-Components Analysis and Random Forests

Abstract: Sociological research typically involves exploring theoretical relationships, but the emergence of “big data” enables alternative approaches. This work shows the promise of data-driven machine-learning techniques involving feature engineering and predictive model optimization to address a sociological data challenge. The author’s group develops improved generalizable models to identify at-risk families. Principal-components analysis and decision tree modeling are used to predict six main dependent variables in…

Citations: cited by 8 publications (11 citation statements)
References: 11 publications
“…Furthermore, among the collected data, not all factors have a high correlation with the target of the forecast. Therefore, before data analysis, database normalization and dimensionality reduction with principal component analysis (PCA) [38][39][40] are required to retain the eigenvalues with higher correlations to the target of the forecast, take the eigenvalues as the variables, and obtain the dependent variable for sales forecasting.…”
Section: Data Collection and Preprocess
Citation type: mentioning
confidence: 99%
“…These new versions sometimes introduce what are called breaking changes, which cause code that used to work to suddenly break. For example, the improvements between Python 2 and Python 3 introduced some breaking changes into Compton (2019), which we discovered when we attempted to use Python 3 to import preprocessed data files that had been serialized with Python 2. Furthermore, new versions sometimes cause much more subtle errors.…”
Section: Five-item Checklist
Citation type: mentioning
confidence: 99%
“…These new versions sometimes introduce what are called breaking changes, which cause code that used to work to suddenly break. For example, the improvements between Python 2 and Python 3 introduced some breaking changes into Compton (2019), which we discovered when we attempted to use Python 3 to import pre-processed data files that had been serialized with Python 2. Further, new versions sometimes cause much more subtle errors.…”
Section: Requirements Files
Citation type: mentioning
confidence: 99%