2002
DOI: 10.1590/s1415-790x2002000200006

Evaluation of different blocking strategies in probabilistic record linkage (original title: Avaliação de diferentes estratégias de blocagem no relacionamento probabilístico de registros)

Abstract: Blocking, which consists of creating logical blocks of records within the files to be linked, is one of the processes involved in the probabilistic linkage of large databases. The objectives of this study are to compare the efficiency of different blocking schemes and to assess the efficiency of a standardization routine developed by the authors, which applies the same spelling to the first syllables of names that sound alike. We linked a database …

Citations: cited by 51 publications (43 citation statements)
References: 12 publications (19 reference statements)
“…With blocking, comparisons are limited to records from the same block, or, in other words, to those records that have the equal values for all variables contained in each blocking key, which reduces the total number of pairs formed at each step and increases the probability of the formation of true pairs. 5 Blocking reduces the number of comparisons made by computer and memory and processor usage, which results in lower processing costs and makes the process faster and less demanding with respect to the computing hardware. A very large number of pairs could make clerical review too complex to warrant doing.…”
Section: Methods
Citation type: mentioning
confidence: 99%
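The effect described in this statement can be illustrated with a minimal sketch. It assumes toy records and a hypothetical blocking key of sex plus year of birth; the function name `blocked_pairs`, the field names, and the data are all invented for illustration and are not taken from the cited papers. Only records that agree on every field of the key are paired, so the number of candidate pairs drops well below the full cross product.

```python
from collections import defaultdict


def blocked_pairs(file_a, file_b, key_fields):
    """Yield candidate pairs whose records agree on every field in the blocking key."""
    # Index file_b by blocking key so each record in file_a only meets its own block.
    blocks = defaultdict(list)
    for rec_b in file_b:
        blocks[tuple(rec_b[f] for f in key_fields)].append(rec_b)
    for rec_a in file_a:
        key = tuple(rec_a[f] for f in key_fields)
        for rec_b in blocks.get(key, []):
            yield rec_a, rec_b


# Invented toy records (hypothetical field names).
hospital = [
    {"id": "H1", "name": "MARIA SILVA", "sex": "F", "birth_year": 1950},
    {"id": "H2", "name": "JOSE SOUZA", "sex": "M", "birth_year": 1962},
    {"id": "H3", "name": "ANA COSTA", "sex": "F", "birth_year": 1950},
]
deaths = [
    {"id": "D1", "name": "MARIA DA SILVA", "sex": "F", "birth_year": 1950},
    {"id": "D2", "name": "JOSE SOUSA", "sex": "M", "birth_year": 1962},
    {"id": "D3", "name": "PEDRO LIMA", "sex": "M", "birth_year": 1971},
]

# Without blocking, every record in one file is compared with every record in the other.
all_pairs = len(hospital) * len(deaths)

# With blocking on (sex, birth_year), only records in the same block are compared.
candidates = list(blocked_pairs(hospital, deaths, ("sex", "birth_year")))
candidate_ids = [(a["id"], b["id"]) for a, b in candidates]

print(all_pairs, len(candidate_ids), candidate_ids)
```

In this toy case the 9 possible pairs shrink to 3 candidates. The pair (H3, D1) survives blocking even though the names differ, which is why a separate comparison step on fields such as name is still needed afterwards.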
“…A five-step blocking strategy was used based on the combination of the following fields: first name Soundex code (modified); last name Soundex code (modified), sex, and year of birth. 5 Next, we performed the matching, in which pairs were formed (an SIH record with a SIM record) from the comparison of previously selected variables. To this end, we used the variables "full name" and "birth date".…”
Section: Methods
Citation type: mentioning
confidence: 99%
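A multi-step blocking strategy of this kind might be sketched as follows. Several points are assumptions rather than the cited method: the code uses the standard American Soundex, not the modified Soundex the statement mentions; the three key combinations in `STEPS` are hypothetical, since the statement does not list the five actual combinations; and the records, field names, and function names are invented for illustration.

```python
from collections import defaultdict

# Standard American Soundex digit groups (assumption: NOT the modified variant cited above).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character standard Soundex code of an unaccented name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate repeated codes; vowels do
            prev = digit
    return (code + "000")[:4]


def blocking_fields(rec):
    """Derive the fields used in blocking keys from a raw record."""
    parts = rec["name"].split()
    return {
        "first_sdx": soundex(parts[0]),
        "last_sdx": soundex(parts[-1]),
        "sex": rec["sex"],
        "birth_year": rec["birth_year"],
    }


# Hypothetical key combinations, from strictest to loosest.
STEPS = [
    ("first_sdx", "last_sdx", "sex", "birth_year"),
    ("first_sdx", "last_sdx", "sex"),
    ("first_sdx", "sex", "birth_year"),
]


def multi_step_candidates(file_a, file_b, steps=STEPS):
    """Return {(id_a, id_b): step_number} for the first step that pairs each couple."""
    found = {}
    derived_b = [(rec["id"], blocking_fields(rec)) for rec in file_b]
    for step_no, key_fields in enumerate(steps, start=1):
        blocks = defaultdict(list)
        for id_b, fields in derived_b:
            blocks[tuple(fields[f] for f in key_fields)].append(id_b)
        for rec in file_a:
            fields = blocking_fields(rec)
            key = tuple(fields[f] for f in key_fields)
            for id_b in blocks.get(key, []):
                found.setdefault((rec["id"], id_b), step_no)
    return found


# Invented toy records standing in for SIH and SIM entries.
sih = [
    {"id": "H1", "name": "JOSE SOUZA", "sex": "M", "birth_year": 1962},
    {"id": "H2", "name": "MARIA SILVA", "sex": "F", "birth_year": 1950},
]
sim = [
    {"id": "D1", "name": "JOSE SOUSA", "sex": "M", "birth_year": 1962},
    {"id": "D2", "name": "MARIA DA SILVA", "sex": "F", "birth_year": 1951},
    {"id": "D3", "name": "MARIO SILVA", "sex": "M", "birth_year": 1950},
]

pairs = multi_step_candidates(sih, sim)
print(pairs)
```

Here the spelling variants SOUZA and SOUSA share one Soundex code, so H1 and D1 are paired in the strictest step, while H2 and D2 differ in year of birth and are only paired once a later step drops that field from the key. Loosening the key step by step is what lets records with an error in one field still reach the comparison stage.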
“…At the same time, the SUS has a set of computerized records that make it possible to identify, from the time of transplantation onward, the main expenditures on these therapies: the Sistema de Informações Hospitalares (SIH, the hospital information system) and the outpatient system (APAC/SIA) (http://www.datasus.gov.br). These databases, together with the Sistema de Informações sobre Mortalidade (SIM, the mortality information system) and other probabilistically linked data 7 , made up the National Database on renal replacement therapies (http://www.datasus.gov.br)/(http://www.bpreco.saude.gov.br/bprefd/owa/consulta.inicio) 8,9 .…”
Section: Introduction
Citation type: unclassified
“…7 . For automatic comparison of records between the databases, the variables name, sex, and date of birth were used; the variables mother's name and address were used for visual comparison when classifying doubtful links at the end of each blocking step.…”
Citation type: unclassified