Daniel Bao scite author profile

Project Background. Social determinants of health (SDoH), such as unstable employment during the pandemic, account for between 30% and 55% of people's health outcomes. While many studies have identified strong associations between specific SDoH and health outcomes, most people experience multiple SDoH in their daily lives. Analysis of this complexity requires the integration of personal, clinical, social, and environmental information from a large cohort of underrepresented populations, which has only recently become available through the All of Us research program. However, little is known about the range of and response to SDoH in All of Us, and how they co-occur to form subtypes, which are critical for designing precision medicine interventions.

Research Questions. (1) What is the range of and response to survey questions related to SDoH? (2) How do SDoH co-occur to form subtypes, and what is their risk for adverse health outcomes?

Methods. For Question-1, we characterized the range of SDoH questions across the surveys and analyzed their responses. For Question-2, we used the following steps: (1) because of missingness across the surveys, selected all participants with valid and complete SDoH data, and used inverse probability weighting to adjust for their demographic imbalance relative to the full cohort; (2) asked domain experts to map the SDoH questions to SDoH subdomains, to enable a more consistent granularity; (3) used bipartite modularity maximization to identify SDoH biclusters, their significance, and their replicability; (4) measured the association of each bicluster with 3 outcomes (depression, delayed medical care, emergency room visits in the last year) using multiple data types (surveys, electronic health records, and zip codes mapped to Medicaid expansion states); and (5) asked 3 domain experts to infer the subtype labels, their mechanisms, and potential targeted interventions.

Results. For Question-1, we identified 110 SDoH questions across 4 surveys, categorized into 18 SDoH subdomains covering all 5 domains in Healthy People 2030 (HP-30). However, there was a large degree of missingness in survey responses (1.76%-84.56%), with later surveys having significantly fewer responses than earlier ones, and significant differences in race, ethnicity, and age of participants when compared to the full cohort. For Question-2, the subtype analysis (n=12,913, d=18) identified 4 biclusters with significant biclusteredness (Q=0.13, random-Q=0.11, z=7.5, P<.001), and significant replication (Real-RI=0.88, Random-RI=0.62, P<.001). Furthermore, there were significant associations of specific subtypes with the outcomes and with Medicaid expansion, each with meaningful interpretations and potential precision interventions. For example, the subtype Socioeconomic Barriers included the SDoH subdomains employment, food security, housing, income, literacy, and educational attainment, and had a significantly higher odds ratio (OR=4.2, CI=3.5-5.1, P-corr<.001) for depression when compared to the subtype Sociocultural Barriers. Individuals who match this subtype profile could be screened early for depression and referred to social services to address combinations of SDoH such as housing and income. Finally, the identified subtypes spanned one or more HP-30 domains, revealing the difference between the current knowledge-based SDoH domains and the data-driven subtypes, reflecting the complexity of how SDoH co-occur in the real world, and their potential use in designing interventions.

Community Impact. While several SDoH models, including the Dahlgren-Whitehead conceptual model, have identified SDoH domains, they have also emphasized that real-world SDoH span multiple domains with complex interactions and feedback loops. However, this phenomenon has been difficult to analyze given the lack of large cohorts with underrepresented populations characterized by a wide range of SDoH and data types. The results from analyzing SDoH using the All of Us cohort provided direct evidence for this real-world phenomenon by showing that data-driven SDoH subtypes span one or more of the SDoH domains defined by Healthy People 2030. This result provides a testable hypothesis for future studies: that SDoH models based on data-driven subtypes will be more accurate and interpretable for predicting adverse health outcomes than existing models that use the knowledge-driven domains. Furthermore, the characterization of the range of and response to SDoH across the entire All of Us cohort, using over one hundred SDoH, should enable researchers to use the approach to characterize other cohorts and to identify and address missingness. Finally, our workbench, which focuses on subtyping SDoH, provides generalizable and scalable machine learning methods that can be used to periodically rerun the analysis as the All of Us cohort continues to evolve.
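The bicluster significance reported above (a real Q compared with Q from randomized data) can be illustrated with a minimal sketch. It assumes Barber's definition of bipartite modularity, which the abstract does not name, and it only scores a given bicluster assignment rather than searching for the best one. The function names and the toy matrix are hypothetical and are not taken from the authors' workbench.

```python
from statistics import mean, stdev


def bipartite_modularity(matrix, row_labels, col_labels):
    """Barber-style bipartite modularity Q of a fixed bicluster assignment.

    matrix: binary participant-by-SDoH matrix (list of lists of 0/1).
    row_labels / col_labels: bicluster id of each row / column.
    Assumption: this formula is an illustration, not the authors' code.
    """
    m = sum(sum(row) for row in matrix)  # total number of edges
    if m == 0:
        raise ValueError("matrix has no edges")
    row_deg = [sum(row) for row in matrix]
    col_deg = [sum(col) for col in zip(*matrix)]
    q = 0.0
    for i, row in enumerate(matrix):
        for j, a_ij in enumerate(row):
            if row_labels[i] == col_labels[j]:
                # observed edge minus the edge expected from degrees alone
                q += a_ij - row_deg[i] * col_deg[j] / m
    return q / m


def modularity_z(real_q, random_qs):
    """z-score of the real Q against Q values obtained from randomized data."""
    return (real_q - mean(random_qs)) / stdev(random_qs)


# Toy data (hypothetical): two perfectly separated biclusters.
toy = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
q_toy = bipartite_modularity(toy, [0, 0, 1, 1], [0, 0, 1, 1])
print(q_toy)  # 0.5
```

In a full analysis the random Q values would come from re-clustering permuted copies of the data; that search step is omitted here.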


284 Generalizable Machine Learning Methods for Subtyping Individuals on National Health Databases: Case Studies Using Data from HRS, N3C, and All of Us

Bhavnani

Zhang

Bao

et al. 2023

J. Clin. Trans. Sci.


OBJECTIVES/GOALS: While disease subtypes are critical for precision medicine, most projects use unipartite clustering methods such as k-means, which are not fully automated, do not provide statistical significance, and are difficult to interpret. These gaps were addressed through bipartite networks and tested for generalizability on three national databases.

METHODS/STUDY POPULATION: Data. All participants with self-reported stroke from the 2010 Health and Retirement Study (HRS), with cases (n=798) having one or more of 8 depressive symptoms measured by the Center for Epidemiologic Studies Depression 8-item (CES-D 8) scale, and controls (n=389) with none of those symptoms. The replication data set consisted of independent, identically defined participants (cases=725, controls=190) from the 1998 HRS. Method. (1) Bipartite network analysis and modularity maximization to automatically identify patient-symptom biclusters with significance. (2) Rand Index to measure the replicability of symptom co-occurrences in the replication data. (3) ExplodeLayout to visualize and interpret the subtypes. (4) R libraries to generalize the methods, upload them to CRAN, and test them on the N3C and All of Us platforms.

RESULTS/ANTICIPATED RESULTS: The analysis identified 4 depressive symptom subtypes (https://postimg.cc/Ny8YwXJW) which had significant modularity (Q=0.26, z=3.03, P […]

DISCUSSION/SIGNIFICANCE: We developed generalizable methods to automatically identify biclusters, measure the clustering significance, and visualize the results for interpretation. These methods were successfully tested on three national-level databases. Such generalizable methods should accelerate the analysis of subtypes and the design of targeted interventions.
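As an illustration of step (2) in the Method above, the following is a minimal sketch of the Rand Index, which scores how often two clusterings agree on whether a pair of items belongs together. The function name and the toy labels are hypothetical; the abstract does not say how the authors' R libraries compute it.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair agrees when both clusterings put the two items together,
    or both put them apart. Cluster ids themselves need not match.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items to form a pair")
    pairs = list(combinations(range(n), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical example: the same grouping of 4 symptoms with renamed clusters.
ri_same = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
print(ri_same)  # 1.0
```

A value of 1.0 means the symptom co-occurrences replicate exactly; comparing against the Rand Index of randomized data, as the abstracts describe, indicates whether the agreement exceeds chance.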


341 The Impact of Critical Social Determinants of Health on Personal Medical Decisions: Analysis of Older Americans in All of Us

Bhavnani

Zhang

Bao

et al. 2023

J. Clin. Trans. Sci.


OBJECTIVES/GOALS: A growing number of older adults in the United States have multiple social determinants of health (SDoH) that are barriers to effective medical care. We used generalizable machine learning methods to identify and visualize subtypes based on participant-reported SDoH profiles, and their association with delayed medical care (self-reported yes/no).

METHODS/STUDY POPULATION: Data. All participants aged >=65 in All of Us with complete data on 18 self-reported SDoH variables, selected through consensus by 2 experienced health services researchers, and guided by Andersen’s behavioral model. Covariates included demographics, and the outcome was delayed medical care. Cases (n=4090) consisted of participants with at least one of the 18 SDoH variables, and controls (n=7414) consisted of participants with none of them. Method. (1) Used bipartite network analysis and modularity maximization to identify participant-SDoH biclusters, and visualized them through ExplodeLayout. (2) Used multivariable logistic regression (adjusted for demographics and corrected through Bonferroni) to measure the odds ratio (OR) of each participant bicluster for the outcome, compared with the controls.

RESULTS/ANTICIPATED RESULTS: The analysis identified 7 SDoH subtypes (https://postimg.cc/Vd7Pg4xZ) with statistically significant modularity compared with 100 random permutations of the data (All of Us=.51, Random Mean=.38, z=20, P […]

DISCUSSION/SIGNIFICANCE: The results identified 7 distinct subtypes based on SDoH profiles and their risk for delayed medical care, highlighting the importance of addressing specific combinations of barriers, with affordability having the highest risk. Furthermore, the analytical methods used are generalizable and have been made publicly available on CRAN and All of Us.
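To make step (2) of the Method above more concrete, here is a simplified sketch of an odds ratio with a Wald confidence interval and a Bonferroni correction. It is deliberately not the multivariable logistic regression the abstract describes: it computes an unadjusted odds ratio from a 2x2 table, with no demographic covariates. The function names and all counts are hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald CI from a 2x2 table.

    a: bicluster members with the outcome, b: members without it,
    c: controls with the outcome,         d: controls without it.
    z=1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


# Hypothetical counts for one bicluster versus the controls.
or_value, ci_low, ci_high = odds_ratio_wald(20, 80, 10, 90)
print(or_value)  # 2.25
```

With 7 subtypes each compared with the controls, a Bonferroni correction of this kind would multiply each p-value by 7, assuming one test per subtype.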


Patients with fibrosis from non-alcoholic steatohepatitis have heterogeneous intrahepatic macrophages and therapeutic targets

Saldarriaga

Krishnan

Wanninger

et al. 2023

Preprint


Background and Aims. In clinical trials for reducing fibrosis in NASH patients, therapeutics that target macrophages have had variable results. We evaluated intrahepatic macrophages in patients with non-alcoholic steatohepatitis to determine if fibrosis influenced phenotypes and expression of CCR2 and Galectin-3.

Approach & Results. We used nCounter to analyze liver biopsies from well-matched patients with minimal (n=12) or advanced (n=12) fibrosis to determine which macrophage-related genes would be significantly different. Known therapy targets (e.g., CCR2 and Galectin-3) were significantly increased in patients with cirrhosis. However, several genes (e.g., CD68, CD16, and CD14) did not show significant differences, and CD163, a marker of pro-fibrotic macrophages, was significantly decreased with cirrhosis. Next, we analyzed patients with minimal (n=6) or advanced (n=5) fibrosis using approaches that preserved hepatic architecture by multiplex staining with anti-CD68, Mac387, CD163, CD14, and CD16. Spectral data were analyzed using deep learning/artificial intelligence to determine percentages and spatial relationships. This approach showed that patients with advanced fibrosis had increased CD68+, CD16+, Mac387+, CD163+, and CD16+CD163+ populations. Interaction of CD68+ and Mac387+ populations was significantly increased in patients with cirrhosis, and enrichment of these same phenotypes in individuals with minimal fibrosis correlated with poor outcomes. Evaluation of a final set of patients (n=4) also showed heterogeneous expression of CD163, CCR2, Galectin-3, and Mac387, and significant differences were not dependent on fibrosis stage or NAFLD activity.

Conclusions. Approaches that leave hepatic architecture intact, like multispectral imaging, may be paramount to developing effective treatments for NASH. In addition, understanding individual differences in patients may be required for optimal responses to macrophage-targeting therapies.


Subtyping Social Determinants of Health in All of Us: Network Analysis and Visualization Approach