2011
DOI: 10.1111/j.1467-9671.2011.01274.x

The Development of a Web‐based Demographic Data Extraction Tool for Population Monitoring

Abstract: The Internet contains a great wealth of information available online. People search engines, such as WhitePages (http://www.whitepages.com), gather personal‐level demographic data, including full name, address, age and household members. Requiring only a surname and locational reference (e.g. city or postal code) as the minimum search criteria, such people search engines can be perceived as a gigantic database of demographic records. The objective of this article is to outline the development of a web‐based de…

Cited by 11 publications (8 citation statements)
References 28 publications
“…[20212228] Self-identification of ethnicity carries risks of preconceived ethnic group classification, so it is important to remember how this data is used in the context of reporting healthcare-related data. [41838]…”
Section: Discussion (mentioning)
confidence: 99%
“…For a valid record of web demographics, the mandatory demographic attributes include a first name or first initial, a valid surname, and a "geocodable" address (Word et al, n.d.; Chow et al 2011). It is not uncommon to have incomplete data that may consist of records without a full name, a name without address, or the other way around.…”
Section: Processing (mentioning)
confidence: 93%
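
The validity rule quoted above (a first name or initial, a valid surname, and a geocodable address) can be illustrated with a minimal sketch. This is not the cited authors' implementation: the `WebRecord` type, the `known_surnames` list used to decide whether a surname is "valid", and the `is_geocodable` callback are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class WebRecord:
    """One extracted record (hypothetical structure, not from the cited work)."""
    first_name: Optional[str]
    surname: Optional[str]
    address: Optional[str]


def is_valid_record(
    record: WebRecord,
    known_surnames: Set[str],
    is_geocodable: Callable[[str], bool],
) -> bool:
    """Return True only if all three mandatory attributes are present.

    Assumptions: a surname counts as "valid" when it appears in
    known_surnames (lowercase), and is_geocodable is supplied by the
    caller, for example a wrapper around a geocoding service.
    """
    has_first = bool(record.first_name and record.first_name.strip())
    has_surname = bool(record.surname) and record.surname.strip().lower() in known_surnames
    has_address = bool(record.address) and is_geocodable(record.address)
    return has_first and has_surname and has_address


def looks_geocodable(address: str) -> bool:
    """Stand-in check for the sketch only: requires a digit in the address."""
    return any(ch.isdigit() for ch in address)
```

Incomplete records of the kind the quotation mentions (a name without an address, or an address without a full name) fail this check and can be set aside for separate handling rather than silently dropped.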
“…The basic idea is to parse the HyperText Markup Language (HTML) document and analyze the hierarchical tags embedded within the content structured in various degrees. Chow et al (2011) outlined a five-step framework to automate the extraction of web demographics based on surname analysis. The process can be extended from a predefined HTML schema to statistical clustering, machine-learning algorithms, data-mining techniques, or ontology-based relationships to better refine the extraction and navigation rules (Chang et al 2006).…”
Section: Acquisition (mentioning)
confidence: 99%
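
As a rough illustration of parsing an HTML document against a predefined schema, the sketch below walks the tag hierarchy with Python's standard `html.parser` module and collects the text found inside field elements. It does not reproduce the five-step framework of Chow et al; the class names `record`, `name`, and `address` are an assumed page layout, and `RecordExtractor` and `extract_records` are hypothetical names.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they are skipped so the stack stays balanced.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class RecordExtractor(HTMLParser):
    """Collect text from field elements nested inside record elements."""

    def __init__(self, record_class, field_classes):
        super().__init__()
        self.record_class = record_class
        self.field_classes = set(field_classes)
        self.records = []
        self._stack = []  # class attribute (or None) of each open tag

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        cls = dict(attrs).get("class")
        self._stack.append(cls)
        if cls == self.record_class:
            self.records.append({})  # a new record starts here

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.records or self.record_class not in self._stack:
            return
        # Attribute the text to the innermost enclosing field element.
        for cls in reversed(self._stack):
            if cls in self.field_classes:
                record = self.records[-1]
                record[cls] = (record.get(cls, "") + " " + text).strip()
                break


def extract_records(html, record_class="record", field_classes=("name", "address")):
    """Parse html and return one dict per record element (assumed schema)."""
    parser = RecordExtractor(record_class, field_classes)
    parser.feed(html)
    return parser.records
```

A fixed schema like this breaks as soon as the page layout changes, which is one reason the quoted passage points to clustering, machine-learning, or ontology-based approaches for refining the extraction rules.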
“…An individual with different addresses would indicate a person who moved during this time. The web-based data extraction tool developed by Chow et al (2011) did not capture the information concerning when a person's demographic information was recorded, if it existed at all. Due to the absence of this temporal information, the process of determining which records received a migration label of 'Move From' or 'Move To' operated under three assumptions.…”
Section: Data Processing (mentioning)
confidence: 99%
“…Of that population, 4.2%, or roughly 1 million, were estimated to be of Asian ethnicity (U.S. Census Bureau, 2014). Using a web-based data extraction tool that leveraged a list of popular Vietnamese surnames, Chow et al (2011) acquired demographic data for 40.3% of the estimated Vietnamese-American (VA) population in Texas as reported by the American Community Survey 2009. Chow et al (2012) used this data set to model population change during the period between 2000 and 2009 at both the county and census tract levels.…”
Section: Introduction (mentioning)
confidence: 99%