Missing-data imputation can improve the performance of prediction models when missing values hide useful information. This paper compares methods for imputing missing categorical data in supervised classification tasks. We experiment on two machine learning benchmark datasets with missing categorical data, comparing classifiers trained on non-imputed (i.e., one-hot encoded) or imputed data under different levels of additional missing-data perturbation. We show that imputation methods can increase predictive accuracy in the presence of missing-data perturbation, and that the perturbation itself can improve prediction accuracy by regularizing the classifier. With missing-data perturbation and k-nearest-neighbors (k-NN) imputation, we achieve state-of-the-art results on the Adult dataset.
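The combination the abstract describes can be sketched briefly. The snippet below is a minimal illustration, not the paper's pipeline: it assumes categorical features have been ordinally encoded as integer codes (a simplification; the paper also considers one-hot encoding), randomly masks entries to simulate missing-data perturbation, and fills them with scikit-learn's `KNNImputer`.

```python
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)

# Toy data: 200 samples, 3 categorical features ordinally encoded
# as integer codes 0..3 (illustrative, not a real dataset).
X = rng.integers(0, 4, size=(200, 3)).astype(float)

# Missing-data perturbation: randomly mask ~10% of entries as NaN.
mask = rng.random(X.shape) < 0.10
X_perturbed = X.copy()
X_perturbed[mask] = np.nan

# k-NN imputation: each NaN is replaced by the mean feature value
# among the k nearest neighbors measured on the observed entries.
imputer = KNNImputer(n_neighbors=5)
X_imputed = imputer.fit_transform(X_perturbed)

# Round back to the nearest category code, since the k-NN mean
# of integer codes is generally fractional.
X_imputed = np.rint(X_imputed)
```

A classifier would then be trained on `X_imputed`; varying the mask rate controls the strength of the perturbation.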
Multiple imputation (MI) is the state-of-the-art approach for dealing with missing data arising from non-response in sample surveys. Multiple imputation by chained equations (MICE) is the most widely used MI method, but it lacks a theoretical foundation and is computationally intensive. Recently, MI methods based on deep learning models have been developed, with encouraging results in small studies. However, there has been limited research systematically evaluating their performance in realistic settings compared with MICE, particularly in large-scale surveys. This paper provides a general framework for comparing MI methods using simulations based on real survey data and several performance metrics. We conduct extensive simulation studies based on American Community Survey data to compare the repeated-sampling properties of four machine-learning-based MI methods: MICE with classification trees, MICE with random forests, generative adversarial imputation networks, and multiple imputation using denoising autoencoders. We find that the deep-learning-based MI methods dominate MICE in computational time; however, MICE with classification trees consistently outperforms the deep learning MI methods in bias, mean squared error, and coverage under a range of realistic settings.
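The MICE-with-trees idea can be approximated in scikit-learn. This is a hedged sketch, not the paper's implementation: `IterativeImputer` (still marked experimental) cycles through columns, modeling each from the others with a supplied estimator; running it m times with different seeds and a randomized imputation order yields m completed datasets, the "multiple" in multiple imputation. The data here are synthetic.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Synthetic data with one correlated column, then ~15% values masked.
X = rng.normal(size=(300, 4))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=300)
X[rng.random(X.shape) < 0.15] = np.nan

# Chained-equations imputation with a tree as the conditional model;
# m runs with different seeds give m completed datasets.
m = 5
completed = [
    IterativeImputer(
        estimator=DecisionTreeRegressor(max_depth=5, random_state=s),
        max_iter=10,
        imputation_order="random",
        random_state=s,
    ).fit_transform(X)
    for s in range(m)
]
```

Downstream analyses would be run on each completed dataset and pooled, e.g., with Rubin's rules.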
Objective: To test whether there are significant, evidence-based differences in effectiveness between self-ligation (SL) and conventional-ligation (CL) brackets.
Materials and Methods: Popular clinical claims about SL were identified through a literature overview of PubMed, EMBASE, the Cochrane Library, and Web of Science for the period 1965-2017, supplemented by hand searching of the references of retrieved articles. Articles meeting the inclusion criteria were qualitatively analyzed using the Cochrane risk-of-bias tool and one other scale. Eligible RCTs were statistically analyzed with weighted-means calculations and forest plots. RCT data that could not yet be synthesized with at least one other RCT were reserved for discussion.
Results: Ten RCTs satisfied the inclusion criteria, six of which were matched for meta-analysis of three popular clinical claims. At the 95% confidence level, differences between SL and CL in space-closure rate, reduction of incisor proclination, and rate of mandibular alignment were not statistically significant. The remaining four RCTs, analyzed collectively, showed no statistically significant difference in discomfort between SL and CL.
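The weighted-means approach underlying such a meta-analysis can be illustrated with a fixed-effect (inverse-variance) pooled estimate. The effect sizes and standard errors below are hypothetical placeholders, not the review's actual data; the point is only the arithmetic behind "not statistically significant at the 95% level."

```python
import math

# Hypothetical per-study mean differences (SL minus CL) and
# standard errors -- illustrative numbers only.
effects = [0.12, -0.05, 0.20]
ses = [0.10, 0.08, 0.15]

# Inverse-variance weights: precise studies count for more.
weights = [1 / se**2 for se in ses]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
se_pool = math.sqrt(1 / sum(weights))

# 95% confidence interval for the pooled difference.
ci_low = pooled - 1.96 * se_pool
ci_high = pooled + 1.96 * se_pool

# If the CI spans 0, the pooled difference is not statistically
# significant at the 5% level.
significant = not (ci_low <= 0 <= ci_high)
```

In a forest plot, each `(effect, se)` pair is one study row and the pooled estimate is the diamond at the bottom.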
Conclusion: The null hypothesis that there are no differences between SL and CL was not rejected, as no statistically significant differences were found. Additional studies of active SL brackets, and well-designed RCTs suitable for meta-analysis that include overall treatment time, are needed. Chair-time efficiency was consistently higher with SL than with CL.