Leonardo Dias scite author profile

Leonardo Dias

2 Publications

15 Citation Statements Received

21 Citation Statements Given

Affiliations

Universidade Federal de Minas Gerais

Publications

Building the National Database of Health Centred on the Individual: Administrative and Epidemiological Record Linkage - Brazil, 2000-2015

Guerra

Pereira

Gurgel

et al. 2018

IJPDS

Introduction: In Brazil, the National Health System (SUS) provides healthcare to the public. The system has multiple administrative databases; the major ones record hospital (SIH) and outpatient (SIA) procedures. Epidemiological information is collected for all populations in subsystems such as mortality (SIM), live births (SINASC) and compulsorily notifiable diseases (SINAN). Each subsystem has its own information system, which can provide information about consultations, clinical information and medicines dispensed. However, these systems are not linked, which prevents individual-centred analysis.

Objective: To describe the methods and results of the parameter setting needed to execute the probabilistic deduplication of large administrative and epidemiological databases in Brazil and to create a National Health Database centred on the individual.

Methods: This paper presents the results of a record linkage model that integrates data from SIH, SIA, SIM, and SINAN, whose formats and attributes differ between systems and over time. These data comprise 1.3 billion records from 2000-2015. Probabilistic and deterministic record linkage were used to deduplicate the data. The Kappa statistic and clerical review were used to ensure the quality of the linkage. A graph algorithm with depth-first search was used to generate the identifiers.

Results: The deterministic deduplication process resulted in a database with 403,113,527 possible unique individuals. After probabilistic deduplication of that database, 159,703,805 unique individuals were identified. This result had an estimated false positive error rate of 3.3% and an estimated false negative error rate of 12.3%.

Conclusions: The National Health Database centred on the individual was generated and will allow researchers to use real-world evidence to conduct clinical, epidemiological, economic and other studies. This database represents a significant cohort, spanning 15 years of historical data while preserving patient privacy. The success of the process described will allow it to be repeated and data for future years to be appended, enabling important studies that promote SUS efficiency and provide better treatments for patients.
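
The abstract mentions a graph algorithm and depth-first search for generating identifiers but gives no implementation details. As a rough sketch of how that step could work, the Python below treats records as graph nodes and accepted links as edges, then gives every record in the same connected component one shared identifier. The function name `assign_person_ids`, the record labels and the integer identifiers are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def assign_person_ids(record_ids, linked_pairs):
    """Give every record in the same connected component the same identifier.

    Hypothetical helper: the paper does not publish its own code.
    """
    # Build an undirected adjacency list from the accepted links.
    neighbours = defaultdict(set)
    for a, b in linked_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    person_id = {}
    next_id = 0
    for start in record_ids:
        if start in person_id:
            continue  # already reached from an earlier record
        # Iterative depth-first search, which avoids recursion limits
        # on very large components.
        stack = [start]
        person_id[start] = next_id
        while stack:
            node = stack.pop()
            for other in neighbours[node]:
                if other not in person_id:
                    person_id[other] = next_id
                    stack.append(other)
        next_id += 1
    return person_id


# Toy data (invented): r1-r2-r3 are chained by links, r4-r5 are linked,
# and r6 has no links at all.
records = ["r1", "r2", "r3", "r4", "r5", "r6"]
links = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
ids = assign_person_ids(records, links)
```

In this toy run, r1 and r3 receive the same identifier even though they were never compared directly, because both are linked to r2. That transitive closure is the reason a graph traversal is a natural fit for this step.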

Unified health database creation: 125 million brazilian cohort from information systems of hospital, outpatient, births, notifications and mortalities

Pereira

Dias

Ávila

et al. 2017

IJPDS

Methods: Semantic analysis of the data was performed to describe and understand the different meanings of the fields in the databases studied. In addition, four main procedures were executed with database operation tools and the PL/SQL programming language: cleaning and standardization of the databases (document numbers were checked against the Brazilian national people's database, with an approximate string-matching algorithm deciding whether a document number belonged to the record); extraction of registration information; deterministic deduplication; and probabilistic deduplication. The procedures were first performed on each database separately; after the records were unified, deterministic deduplication was performed again. The exception was probabilistic deduplication, which was performed only on the final deterministically deduplicated database. These procedures informed the choice of fields used in the data model for the unified database. Nine representative patient-related fields were selected: patient's name; patient's mother's name; sex; birth date; state; city; zip code; CPF and CNS (Brazilian documents).

Results: Initially, the unified registration database contained 705,599,785 records; deterministic deduplication reduced this to 198,400,762 records. This reduction is explained by the fact that these databases are not fully integrated. Moreover, the systems' semantics do not always agree, and in some cases the data format changes over the period within the same system. After probabilistic deduplication, the number of unique records decreased to 124,545,186, which is explained by pairs that the deterministic process had not linked. This result has an estimated error of at most 3.3% false positive pairs and at most 12.3% false negative pairs.

Conclusion: The results show that data deduplication is necessary and should be carried out thoroughly. Where the databases had limited patient registration information, the technique made it possible to capture additional information on a more complete basis. Furthermore, it helped to identify and understand positive and negative aspects of the systems and to trace patients' clinical conditions, enabling pharmacoeconomic and epidemiological studies that define the effectiveness and efficiency of public policies and embedded technologies. As future work, it is important to ensure the uniqueness of records and to link this database with earlier periods.
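
The two deduplication stages described in this abstract can be illustrated with a small Python sketch. It is not the authors' PL/SQL implementation: the field names, the use of `difflib.SequenceMatcher` as the string comparator and the 0.85 threshold are all assumptions made for illustration, and the similarity threshold is a simplified stand-in for a full probabilistic linkage model with field weights.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(text):
    """Upper-case, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def deterministic_dedup(records):
    """Keep the first record for each exact match on the normalised key fields."""
    seen = {}
    for rec in records:
        key = (normalise(rec["name"]), normalise(rec["mother_name"]),
               rec["sex"], rec["birth_date"])
        seen.setdefault(key, rec)
    return list(seen.values())


def probable_match(a, b, threshold=0.85):
    """Flag a pair as a probable duplicate (threshold is a hypothetical value)."""
    # Require exact agreement on sex and birth date, then compare names fuzzily.
    if a["sex"] != b["sex"] or a["birth_date"] != b["birth_date"]:
        return False
    name_sim = SequenceMatcher(None, normalise(a["name"]), normalise(b["name"])).ratio()
    mother_sim = SequenceMatcher(None, normalise(a["mother_name"]),
                                 normalise(b["mother_name"])).ratio()
    return (name_sim + mother_sim) / 2 >= threshold


# Invented example records.
a = {"name": "Maria da Silva", "mother_name": "Ana da Silva", "sex": "F", "birth_date": "1980-01-02"}
b = {"name": "MARIA  DA SILVA ", "mother_name": "Ana da Silva", "sex": "F", "birth_date": "1980-01-02"}
c = {"name": "Maria da Sylva", "mother_name": "Ana da Silva", "sex": "F", "birth_date": "1980-01-02"}
d = {"name": "João Pereira", "mother_name": "Rita Pereira", "sex": "M", "birth_date": "1975-06-30"}
e = {"name": "Carla Souza", "mother_name": "Lucia Souza", "sex": "F", "birth_date": "1980-01-02"}

unique = deterministic_dedup([a, b, c, d, e])
```

Here the deterministic pass merges `a` and `b`, which differ only in case and spacing, while the misspelled `c` survives it and is caught only by the fuzzy comparison in `probable_match(a, c)`. This mirrors the abstract's explanation that the probabilistic stage finds pairs the deterministic process did not link.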

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

