2017 IEEE 30th International Symposium on Computer-Based Medical Systems (CBMS) 2017
DOI: 10.1109/cbms.2017.64

Probabilistic Integration of Large Brazilian Socioeconomic and Clinical Databases

Abstract: The integration of disparate large and heterogeneous socioeconomic and clinical databases is now considered essential to capture and model longitudinal and social aspects of diseases. However, such integration has significant challenges associated with it. Databases are often stored in disparate locations, make use of different identifiers, have variable data quality, record information in bespoke purpose-specific formats and have different levels of associated metadata. Novel computational methods are require…

Cited by 6 publications (5 citation statements)
References 14 publications

“…Demographic and socioeconomic adjusting variables showed an overall protective effect of being a woman, having a higher level of education, and having a higher income per capita, and an increased risk among Black people, older people, and those living in households built with precarious materials (table 3). The associations for all adjusting variables for the models in table 4 are shown in the appendix 2 (pp 15–18).…”
Section: Results (mentioning)
confidence: 99%
“…Algorithms and codes were developed to make efficient, highly sensitive, and specific linkages using the name of each individual, date of birth, sex, and municipality of residence present in each of these data systems. 11,14,15 The 100 Million Brazilians Cohort baseline and SINAN datasets were linked by five individual-level identifiers in two steps using the CIDACS-record linkage tool. In the first step, entries were deterministically linked.…”
Section: Implications of All the Available Evidence (mentioning)
confidence: 99%
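
The deterministic first step mentioned in the statement above can be illustrated with a minimal sketch. It assumes exact agreement on normalised copies of the four identifiers the statement names (name, date of birth, sex, municipality of residence). The field names, the normalisation rules, and the function names are hypothetical and are not taken from the CIDACS-record linkage tool.

```python
import unicodedata

# Hypothetical field names for the identifiers named in the citation statement.
KEY_FIELDS = ("name", "birth_date", "sex", "municipality")


def normalise(text: str) -> str:
    """Strip accents, upper-case, and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def link_key(record: dict) -> tuple:
    """Build the exact-match key from the normalised identifiers."""
    return tuple(normalise(record[field]) for field in KEY_FIELDS)


def deterministic_link(left: list, right: list):
    """Return (linked index pairs, unlinked left indices).

    A left record is linked only when exactly one right record shares its key;
    everything else is left for a later linkage step.
    """
    index = {}
    for j, record in enumerate(right):
        index.setdefault(link_key(record), []).append(j)

    linked, unlinked = [], []
    for i, record in enumerate(left):
        candidates = index.get(link_key(record), [])
        if len(candidates) == 1:
            linked.append((i, candidates[0]))
        else:
            unlinked.append(i)
    return linked, unlinked
```

Records that fail this exact match are returned separately so that a later, more tolerant step can examine them.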
“…There are some techniques to mitigate data incompleteness in LBSM. For instance, Pinto et al [23] proposed a record linkage approach to enrich incomplete data. Dubois and Prade [24] and Yagger [25] used possibility theory and the probability of fuzzy events to handle imperfect data.…”
Section: LBSM Data Aspects (mentioning)
confidence: 99%
“…Field level Bloom filters encode each identifier into a separate Bloom filter [14]. Record linkage techniques (deterministic and probabilistic) can then be used to link records in much the same way as with unencrypted identifiers [15][16][17]. Record level (or composite) Bloom filters encode two or more identifiers into a single Bloom filter [5,18].…”
Section: Introduction (mentioning)
confidence: 99%
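
The distinction between field-level and record-level Bloom filters drawn in the statement above can be sketched as follows. This is an illustrative encoding only: the bigram tokenisation, the filter size, the number of hash functions, the salted SHA-256 hashing, and all function names are assumptions, not the scheme used in the cited works.

```python
import hashlib


def bigrams(value: str) -> set:
    """Split a padded, lower-cased string into character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, size: int = 1024, num_hashes: int = 4, salt: str = "") -> set:
    """Return the set of bit positions that the value switches on."""
    positions = set()
    for gram in bigrams(value):
        for k in range(num_hashes):
            digest = hashlib.sha256(f"{salt}|{k}|{gram}".encode("utf-8")).digest()
            positions.add(int.from_bytes(digest[:8], "big") % size)
    return positions


def encode_field_level(record: dict) -> dict:
    """Field-level encoding: one separate Bloom filter per identifier."""
    return {field: bloom_encode(value) for field, value in record.items()}


def encode_record_level(record: dict) -> set:
    """Record-level (composite) encoding: all identifiers in a single filter.

    Salting by field name (an assumption) keeps the same bigram in different
    fields from mapping to the same positions.
    """
    positions = set()
    for field, value in record.items():
        positions |= bloom_encode(value, salt=field)
    return positions


def dice(a: set, b: set) -> float:
    """Dice coefficient between two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))
```

Comparing two encoded values with `dice` gives a similarity between 0 and 1, which is what lets linkage on the encoded identifiers proceed in much the same way as on unencrypted ones.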
“…Probabilistic record linkage is preferred by many data linkage centres due to its proven track record of producing high quality linkage results from unencrypted identifiers [21][22][23]. It has been shown to produce equally good results when applied to Bloom filters [1,15,16]. An extension to the basic probabilistic model of record linkage allows for approximate matches between fields.…”
Section: Introduction (mentioning)
confidence: 99%
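
The probabilistic model with approximate field matches described in the statement above can be sketched as a sum of per-field log-likelihood weights, in the style commonly attributed to Fellegi and Sunter. The m and u probabilities, the similarity threshold, the linear scaling of partial agreement, and the field names are all illustrative assumptions; the citing paper's actual model may differ.

```python
import math
from difflib import SequenceMatcher

# Illustrative (m, u) values, not estimates from any real data:
# m = P(field agrees | records match), u = P(field agrees | records do not match).
M_U = {
    "name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
    "municipality": (0.90, 0.05),
}


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def field_weight(field: str, a: str, b: str, threshold: float = 0.8) -> float:
    """Weight for one field, allowing approximate agreement."""
    m, u = M_U[field]
    agree = math.log2(m / u)                 # full agreement weight
    disagree = math.log2((1 - m) / (1 - u))  # full disagreement weight
    sim = similarity(a, b)
    if sim < threshold:
        return disagree
    # Scale linearly from the disagreement weight (at the threshold)
    # up to the agreement weight (at similarity 1.0).
    return disagree + (agree - disagree) * (sim - threshold) / (1 - threshold)


def record_score(rec_a: dict, rec_b: dict) -> float:
    """Total match score: the sum of the per-field weights."""
    return sum(field_weight(field, rec_a[field], rec_b[field]) for field in M_U)
```

A pair is then classified by comparing `record_score` against upper and lower cut-offs, which would need to be chosen for the data at hand.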