Anya Mathur scite author profile

Anya Mathur

5 Publications

8 Citation Statements Received

121 Citation Statements Given

Affiliations

University of Washington

Publications

Explainable machine learning models to understand determinants of COVID-19 mortality in the United States

Mathur, Sethi, Mathur, et al. 2020

Preprint

COVID-19 is now one of the leading causes of mortality amongst adults in the United States for the year 2020. Multiple epidemiological models have been built, often based on limited data, to understand the spread and impact of the pandemic. However, many geographic and local factors may have played an important role in higher morbidity and mortality in certain populations. The goal of this study was to develop machine learning models to understand the relative association of socioeconomic, demographic, travel, and health care characteristics of different states across the United States and COVID-19 mortality. Using multiple public data sets, 24 variables linked to COVID-19 disease were chosen to build the models. Two independent machine learning models using CatBoost regression and random forest were developed. SHAP feature importance and a Boruta algorithm were used to elucidate the relative importance of features on COVID-19 mortality in the United States. Feature importances from both the categorical models, i.e., CatBoost and random forest, consistently showed that a high population density, number of nursing homes, number of nursing home beds and foreign travel were the strongest predictors of COVID-19 mortality. The percentage of African Americans in the population was also found to be of high importance in prediction of COVID-19 mortality whereas racial majority (primarily, Caucasian) was not. Both models fitted the data well with a training R² of 0.99 and 0.88, respectively. The effect of median age, median income, climate and disease mitigation measures on COVID-19 related mortality remained unclear. COVID-19 policy making will need to take population density, pre-existing medical care and state travel policies into account. Our models identified and quantified the relative importance of each of these for mortality predictions using machine learning.

Pooled Prevalence of Adverse Pregnancy and Neonatal Outcomes in Malawi, South Africa, Uganda, and Zimbabwe: Results From a Systematic Review and Meta-Analyses to Inform Trials of Novel HIV Prevention Interventions During Pregnancy

Lokken, Mathur, Bunge, et al. 2021

Front. Reprod. Health

Background: Robust data summarizing the prevalence of pregnancy and neonatal outcomes in low- and middle-income countries are critically important for studies evaluating investigational products for HIV prevention and treatment in pregnant and breastfeeding women. In preparation for studies evaluating the safety of the dapivirine vaginal ring for HIV prevention in pregnancy, we conducted a systematic literature review and meta-analyses to summarize the prevalence of pregnancy and neonatal outcomes in Malawi, South Africa, Uganda, and Zimbabwe. Methods: Ten individual systematic literature reviews were conducted to identify manuscripts presenting prevalence data for 12 pregnancy and neonatal outcomes [pregnancy loss, stillbirth, preterm birth, low birthweight (LBW), neonatal mortality, congenital anomaly, chorioamnionitis, postpartum endometritis, postpartum hemorrhage, gestational hypertension, preeclampsia/eclampsia, and preterm premature rupture of membranes (PPROM)]. Studies included in the meta-analyses were published between January 1, 1998, and July 11, 2018, provided numerator and denominator data to support prevalence estimation, and included women of any HIV serostatus. Random-effects meta-analyses were conducted to estimate the pooled prevalence and 95% confidence interval (CI) for each outcome overall, by country, and by HIV status. Results: A total of 152 manuscripts were included across the 12 outcomes. Overall, the frequency of stillbirth (n = 75 estimates), LBW (n = 68), and preterm birth (n = 67) were the most often reported. However, fewer than 10 total manuscripts reported prevalence estimates for chorioamnionitis, endometritis, or PPROM. The outcomes with the highest pooled prevalence were preterm birth (12.7%, 95% CI 11.2–14.3), LBW (11.7%, 95% CI 10.6–12.9), and gestational hypertension (11.4%, 95% CI 7.8–15.7). Among the outcomes with the lowest pooled prevalence estimates were neonatal mortality (1.7%, 95% CI 1.4–2.1), pregnancy loss [1.9%, 95% CI 1.1–2.8, predominately studies (23/29) assessing losses occurring after the first trimester], PPROM (2.2%, 95% CI 1.5–3.2), and stillbirth (2.5%, 95% CI 2.2–2.7). Conclusions: Although this review identified numerous prevalence estimates for some outcomes, data were lacking for other important pregnancy-related conditions. Additional research in pregnant populations is needed for a thorough evaluation of investigational products, including for HIV prevention and treatment, and to inform better estimates of the burden of adverse pregnancy outcomes globally.
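The abstract above reports pooled prevalence from random-effects meta-analyses but does not say which estimator or transformation was used. As a rough illustration only, the sketch below assumes DerSimonian-Laird pooling of logit-transformed proportions; the function name `pooled_prevalence` and the example counts are hypothetical and are not taken from the review.

```python
import math

def pooled_prevalence(studies):
    """DerSimonian-Laird random-effects pooling of logit-transformed proportions.

    studies: list of (events, total) pairs, each with 0 < events < total.
    Returns (pooled, ci_low, ci_high) on the proportion scale.
    Assumption: the review's actual estimator may differ from this one.
    """
    # Per-study logit estimate and its approximate variance.
    y = [math.log(e / (n - e)) for e, n in studies]
    v = [1 / e + 1 / (n - e) for e, n in studies]

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # Between-study variance tau^2 (method of moments), truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and its standard error.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    # Back-transform the estimate and 95% CI to the proportion scale.
    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se)

# Hypothetical (events, total) counts for one outcome across four studies.
example = [(30, 250), (55, 400), (12, 120), (90, 610)]
pooled, low, high = pooled_prevalence(example)
print(f"pooled prevalence {pooled:.3f} (95% CI {low:.3f}-{high:.3f})")
```

Pooling on the logit scale keeps the back-transformed interval inside 0 to 1, which matters for rare outcomes such as the low-prevalence ones listed in the abstract.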

Implementation of a fully remote randomized clinical trial with cardiac monitoring

et al. 2021

Background: The coronavirus disease 2019 (COVID-19) pandemic has challenged researchers performing clinical trials to develop innovative approaches to mitigate infectious risk while maintaining rigorous safety monitoring. Methods: In this report we describe the implementation of a novel exclusively remote randomized clinical trial (ClinicalTrials.gov NCT04354428) of hydroxychloroquine and azithromycin for the treatment of the SARS-CoV-2–mediated COVID-19 disease which included cardiovascular safety monitoring. All study activities were conducted remotely. Self-collected vital signs (temperature, respiratory rate, heart rate, and oxygen saturation) and electrocardiographic (ECG) measurements were transmitted digitally to investigators while mid-nasal swabs for SARS-CoV-2 testing were shipped. ECG collection relied on a consumer device (KardiaMobile 6L, AliveCor Inc.) that recorded and transmitted six-lead ECGs via participants’ internet-enabled devices to a central core laboratory, which measured and reported QTc intervals that were then used to monitor safety. Results: Two hundred and thirty-one participants uploaded 3245 ECGs. Mean daily adherence to the ECG protocol was 85.2% and was similar to the survey and mid-nasal swab elements of the study. Adherence rates did not differ by age or sex assigned at birth and were high across all reported race and ethnicities. QTc prolongation meeting criteria for an adverse event occurred in 28 (12.1%) participants, with 2 occurring in the placebo group, 19 in the hydroxychloroquine group, and 7 in the hydroxychloroquine + azithromycin group. Conclusions: Our report demonstrates that digital health technologies can be leveraged to conduct rigorous, safe, and entirely remote clinical trials.

Managing the Infodemic: Leveraging Deep Learning to Evaluate the Maturity Level of AI-Based COVID-19 Publications for Knowledge Surveillance and Decision Support

Awasthi, Nagori, Mishra, et al. 2023

Preprint

The COVID-19 pandemic has taught us many lessons, including the need to manage the exponential growth of knowledge, fast-paced development or modification of existing AI models, limited opportunities to conduct extensive validation studies, the need to understand bias and mitigate it, and lastly, implementation challenges related to AI in healthcare. While the nature of the dynamic pandemic, resource limitations, and evolving pathogens were key to some of the failures of AI to help manage the disease, the infodemic during the pandemic could be a key opportunity that we could manage better. We share our research related to the use of deep learning methods to quantitatively and qualitatively evaluate AI-based COVID-19 publications, which provides a unique approach to identify mature publications using a validated model and shows how that can be leveraged further by focused human-in-the-loop analysis. The study utilized research articles in English that were human-based, extracted from PubMed spanning the years 2020 to 2022. The findings highlight notable patterns in publication maturity over the years, with consistent and significant contributions from China and the United States. The analysis also emphasizes the prevalence of image datasets and variations in employed AI model types. To manage an infodemic during a pandemic, we provide a specific knowledge surveillance method to identify key scientific publications in near real-time. We hope this will enable data-driven and evidence-based decisions that clinicians, data scientists, researchers, policymakers, and public health officials need to make with time sensitivity while keeping humans in the loop.

Explainable machine learning models to understand determinants of COVID-19 mortality in the United States (Preprint)

Mathur, Sethi, Mathur, et al. 2020

Preprint

Introduction: The COVID-19 pandemic exhibits an uneven geographic spread which leads to a locational mismatch of testing, mitigation measures and allocation of healthcare resources (human, equipment, and infrastructure).(1) In the absence of effective treatment, understanding and predicting the spread of COVID-19 is unquestionably valuable for public health and hospital authorities to plan for and manage the pandemic. While there have been many models developed to predict mortality, the authors sought to develop a machine learning prediction model that provides an estimate of the relative association of socioeconomic, demographic, travel, and health care characteristics of COVID-19 disease mortality among states in the United States (US). Methods: State-wise data was collected for all the features predicting COVID-19 mortality and for deriving feature importance (eTable 1 in the Supplement).(2) Key feature categories include demographic characteristics of the population, pre-existing healthcare utilization, travel, weather, socioeconomic variables, racial distribution and timing of disease mitigation measures (Figure 1 & 2). Two machine learning models, CatBoost regression and random forest, were trained independently to predict mortality in states on data partitioned into a training (80%) and test (20%) set.(3) Accuracy of models was assessed by R² score. Importance of the features for prediction of mortality was calculated via two machine learning algorithms: SHAP (SHapley Additive exPlanations) calculated upon the CatBoost model, and Boruta, a random forest based method trained with 10,000 trees for calculating statistical significance (3-5). Results: Results are based on 60,604 total deaths in the US, as of April 30, 2020. Actual number of deaths ranged widely from 7 (Wyoming) to 18,909 (New York). The CatBoost regression model obtained an R² score of 0.99 on the training data set and 0.50 on the test set. The random forest model obtained an R² score of 0.88 on the training data set and 0.39 on the test set. Nine out of twenty variables were significantly higher than the maximum variable importance achieved by the shadow dataset in Boruta regression (Figure 2). Both models showed high feature importance for pre-existing high healthcare utilization, reflected in nursing home beds per capita and doctors per 100,000 population. Overall population characteristics such as total population and population density also correlated positively with the number of deaths. Notably, both models revealed a high positive correlation of deaths with percentage of African Americans. Direct flights from China, especially Wuhan, were also significant in both models as predictors of death, therefore reflecting early spread of the disease. Associations between deaths and weather patterns, hospital bed capacity, median age, and timing of administrative action to mitigate disease spread, such as the closure of educational institutions or stay at home order, were not significant. The lack of some associations, e.g., administrative action, may reflect delayed outcomes of interventions which were not yet reflected in data. Discussion: COVID-19 disease has varied spread and mortality across communities amongst different states in the US. While our models show that high population density, pre-existing need for medical care and foreign travel may increase transmission and thus COVID-19 mortality, the effect of geographic, climate and racial disparities on COVID-19 related mortality is not clear. The purpose of our study was not state-wise accurate prediction of deaths in the US, which has already been challenging.(6) Location based understanding of key determinants of COVID-19 mortality is critically needed for focused targeting of mitigation and control measures. Risk assessment-based understanding of determinants affecting COVID-19 outcomes, using a dynamic and scalable machine learning model such as the two proposed, can help guide resource management and policy framework.
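The abstract above reports R² separately for the training and test partitions (0.99 vs. 0.50 for CatBoost, 0.88 vs. 0.39 for random forest). As a minimal sketch of how such a score is computed, the snippet below implements the standard coefficient of determination in plain Python; the function name `r2_score` and the sample numbers are illustrative assumptions, not the study's code or data.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical death counts and predictions (not the study's data).
train_true, train_pred = [10, 200, 450, 900], [12, 195, 460, 890]
test_true, test_pred = [50, 300, 700], [150, 280, 450]

print(f"train R2 = {r2_score(train_true, train_pred):.2f}")
print(f"test  R2 = {r2_score(test_true, test_pred):.2f}")
```

A training score well above the test score, as in the reported figures, generally indicates that a model fits its training data more closely than it generalizes, which is in line with the abstract's note that accurate state-wise prediction was not the study's purpose.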
