During the 19th and early 20th centuries, about 220,000 Dutch-born persons migrated to the USA. The Historical Sample of the Netherlands (HSN) contains about 85,500 persons born in the Netherlands between 1812 and 1922. In this article we report how we matched persons from the HSN with the American censuses from the period 1850 to 1940. For this purpose, a linking process was designed comprising three stages: harmonization, matching and validation. The different nature of the two datasets (the HSN and the USA censuses) required some harmonization prior to matching. Once the data had been properly prepared, two strategies were applied to link the datasets. The first, called the Similarity Approach, matched individuals from both datasets on the basis of the resemblance of their first and last names. The second, called the Transformation Approach, made use of dictionaries of Anglicized versions of Dutch first and last names and their most common or most likely Dutch original(s). Because the HSN is a sample, even exact matches showed ambiguity that needed to be resolved. For this reason, a validation process comparing the household context was run to provide a more trustworthy result. In the end we identified 484 individuals in the HSN database with reliable links to the American censuses. We also evaluated the result in the light of what is known about emigration patterns to the USA over time, and we concluded that our efforts have produced a reasonable result. Nevertheless, we are aware that we may have missed links. We also found that at least 45% of the emigrants returned to the Netherlands at some point during their life course.
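The two matching strategies described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary entries, the 0.8 threshold, the function names, and the use of `difflib.SequenceMatcher` as the resemblance measure are all assumptions made for illustration, and the abstract does not state which similarity metric or dictionaries were actually used.

```python
from difflib import SequenceMatcher

# Hypothetical dictionary entries (Anglicized form -> possible Dutch originals).
# These are illustrative only and are not taken from the article.
ANGLICIZED_TO_DUTCH = {
    "john": ["jan", "johannes"],
    "william": ["willem"],
    "smith": ["smit"],
}


def name_similarity(a, b):
    """Return a resemblance score between 0 and 1 for two name strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_match(hsn_name, census_name, threshold=0.8):
    """Similarity-style match: both first and last name must resemble each other.

    Each name is a (first, last) tuple. The threshold is an assumed value.
    """
    return all(
        name_similarity(h, c) >= threshold
        for h, c in zip(hsn_name, census_name)
    )


def transformation_match(hsn_name, census_name):
    """Transformation-style match: map each census name part back to its
    candidate Dutch originals and look for the HSN name part among them.

    Name parts missing from the dictionary are compared as they are.
    """
    for h, c in zip(hsn_name, census_name):
        candidates = ANGLICIZED_TO_DUTCH.get(c.lower(), [c.lower()])
        if h.lower() not in candidates:
            return False
    return True
```

In this sketch, a spelling variant such as ("Willem", "Jansen") versus ("Willem", "Janssen") passes the similarity check, while an Anglicized pair such as ("Jan", "Smit") versus ("John", "Smith") is only caught by the dictionary lookup. Candidate links from either strategy would still need the household-context validation mentioned above.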
The Antwerp COR*-IDS database 2020 is a transformed and harmonized historical demographic database in a cross-nationally comparable format, designed to be open and easy to use for international researchers. The database is constructed from the 2010 release of the Antwerp COR*-historical demographic database, which was created using a letter sample of the whole district of Antwerp (Flanders, Belgium). It has a total sample size of approximately 33,000 residents of Antwerp and spans nearly seven decades. The data are collected from historical records, including population registers and vital registration records covering births, marriages, internal and external migrations, and deaths. The database covers up to three linked generations (in some cases more) and contains micro-data on individual-level life courses, as well as relationships derived from address-based household composition methods. An important characteristic is the sample's large migrant population, including the timing of their demographic events and their living arrangements while resident in the district of Antwerp. In addition, the sample contains a large array of occupational information. This paper presents the processes, methodologies and documentation regarding the evaluation and development of a pre-existing historical database. This includes the systematic evaluation of the original samples, methodologies for the address-based reconstruction of households, and the geocoding of the historical database, all of which took place during the development of this new version of the database.