Proceedings of the ACM Conference on Health, Inference, and Learning 2020
DOI: 10.1145/3368555.3384451

Population-aware hierarchical Bayesian domain adaptation via multi-component invariant learning

Abstract: While machine learning is rapidly being developed and deployed in health settings such as influenza prediction, there are critical challenges in using data from one environment in another due to variability in features; even within disease labels there can be differences (e.g. "fever" may mean something different reported in a doctor's office versus in an online app). Moreover, models are often built on passive, observational data which contain different distributions of population subgroups (e.g. men or women…

Cited by 6 publications (8 citation statements)
References 20 publications (11 reference statements)
“…The often-linked geographic information and time stamps further enable the capture of hyper-local, daily and sub-daily health-related information from behaviours to exposures and other macro-level properties, as well as health outcomes [23,24]. Accordingly, such ubiquitous technologies can provide opportunities to better measure the social determinants of health in a targeted way, by person, location and/or time [21][22][23][24][25]. These attributes of person-generated data can complement denominator-based survey and report data that are not available at such high granularity.…”
Section: Machine Learning and Algorithmic Fairness In Public And Population Health
citation type: mentioning
confidence: 99%
“…Domain adaptation methods can be developed to address distribution shifts that may occur across different environments, both based on different populations and data generation mechanisms [25,63,64]. In particular, thinking about data distribution shifts and differences from a causal perspective has been utilized to inform the empirical learning processes [23,65].…”
Section: Current Challenges
citation type: mentioning
confidence: 99%
“…The domain adaptation problem refers to the situation where a statistical learning model trained on one labeled dataset needs to be generalized to the target dataset, or target domain, drawn from a different distribution and with insufficient labeled data (Daume III and Marcu, 2006). Learning from data collected in different domains is an active area of research in computer science and has been explored in various applications including natural language processing (Ramponi and Plank, 2020), visual classification (Wang and Deng, 2018), sentiment prediction (Glorot et al, 2011), and more recently in prediction problems in public health and clinical settings (Rehman et al, 2018; Mhasawade et al, 2020; Laparra et al, 2020).…”
Section: Domain Adaption In Cause-of-death Assignment
citation type: mentioning
confidence: 99%
“…For example, recent work on how individual-level syndromic reports or passive data from individuals relates to microbiologic confirmation [14], aims to address early challenges highlighted in using data such as Google searches to predict influenza [29]. Research on the data generating process of new digital data sources is also imperative to understand what the data represents, and why data is being shared by individuals [37]. Second, can the results be extended to other situations, groups, or events; which sub-populations are representative of the hypothesis [38]?…”
Section: Measuring Social Determinants Of Health
citation type: mentioning
confidence: 99%
“…Even if similar syndromic data is collected across all the approaches, there are considerable differences in the predictive value of the syndromic data [12], highlighting that the same type of data can represent different factors across studies (in this case the mode of data collection can affect the specificity/sensitivity of respiratory infection syndromic data). Although there are significant differences in sample representation and predictivity across studies, it is crucial to understand external validity as it provides a way to understand population-level characteristics that remain invariant across studies [11,37].…”
Section: Measuring Social Determinants Of Health
citation type: mentioning
confidence: 99%