Abstract. A variety of gazetteers exist based on administrative or user-contributed data. Each of these data sources has benefits for particular geographical analysis and information retrieval tasks, but none is a one-size-fits-all solution. We present a mediation framework to access and integrate distributed gazetteer resources to build a meta-gazetteer that generates augmented versions of place name information. The approach combines different aspects of place name data from multiple gazetteer sources that refer to the same geographic place and employs several similarity metrics to identify equivalent toponyms.
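As a rough illustration of identifying equivalent toponyms across gazetteers, the sketch below combines a string similarity with a distance check. It is not the framework's actual metric set, which the abstract does not specify: the record layout `(name, lat, lon)`, the thresholds `min_name_sim` and `max_km`, and all function names are assumptions made for this example, and the coordinates are approximate.

```python
import math
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a spherical Earth."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))

def likely_same_place(rec_a, rec_b, min_name_sim=0.85, max_km=5.0):
    """Hypothetical rule: names must be similar AND locations close."""
    name_a, lat_a, lon_a = rec_a
    name_b, lat_b, lon_b = rec_b
    similar = name_similarity(name_a, name_b) >= min_name_sim
    close = haversine_km(lat_a, lon_a, lat_b, lon_b) <= max_km
    return similar and close

# Illustrative records (approximate coordinates), as if from two gazetteers.
admin_entry = ("Cardiff", 51.4816, -3.1791)
user_entry = ("cardiff", 51.4800, -3.1800)
other_entry = ("Cardigan", 52.08, -4.66)
```

Requiring both conditions is one simple way to avoid merging distinct places that share a name; a real mediation layer would likely also weigh feature type and alternate names.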
Vernacular place names are names that are commonly in use to refer to geographical places. For purposes of effective information retrieval, the spatial extent associated with these names should reflect people's perception of the place, even though this may sometimes differ from the administrative definition of the same place name. Due to their informal nature, vernacular place names are hard to capture, but methods to acquire and define vernacular place names are of great benefit to search engines and all kinds of information services that deal with geographic data. This paper discusses the acquisition of vernacular use of place names from web sources and their representation as surface models derived by kernel density estimators.
In this paper, we describe a methodology to estimate the geographic coverage of the web without the need for secondary knowledge or complex geo-tagging. This is achieved by randomly selecting toponyms from the Ordnance Survey 50K gazetteer to create search queries and thus gather document counts from various web sources for Great Britain. The same gazetteer is then used to geo-code the results and enable mapping. To validate our approach, and to demonstrate the effects of geo/non-geo and geo/geo ambiguity, we mapped the selected toponyms to Geograph, a community project that contains user-generated geo-tagged photographs of the UK. Although success varies with resolution, the proposed approach is likely to be reliable enough for applications exploring the geographic coverage of the web in cases where references to settlements are likely to be common. In our case, we applied the method to produce maps of web coverage for a range of sources at a resolution of 30 km.
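The sampling-and-mapping pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: gazetteer entries are assumed to be `(name, easting, northing)` tuples in metres, `count_fn` is a hypothetical stand-in for whatever returns a document count for a query, and the 30 km cell size follows the resolution mentioned in the abstract.

```python
import random
from collections import defaultdict

CELL_M = 30_000  # 30 km grid cells, in metres

def sample_toponyms(gazetteer, k, seed=0):
    """Randomly select k gazetteer entries (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(gazetteer, k)

def grid_counts(entries, count_fn, cell_m=CELL_M):
    """Sum document counts per grid cell, geo-coding via the gazetteer coordinates."""
    grid = defaultdict(int)
    for name, easting, northing in entries:
        cell = (int(easting // cell_m), int(northing // cell_m))
        grid[cell] += count_fn(name)  # count_fn is a hypothetical query function
    return dict(grid)

# Toy data: made-up names, coordinates, and counts for illustration only.
toy_gazetteer = [("A", 10_000, 10_000), ("B", 29_999, 0), ("C", 30_000, 0)]
toy_counts = {"A": 3, "B": 4, "C": 5}
coverage = grid_counts(toy_gazetteer, toy_counts.get)
```

Because the toponym alone is used both as the query and as the geo-code, ambiguous names (geo/non-geo or geo/geo) inflate the count of whichever cell the gazetteer entry falls in, which is the effect the validation against Geograph is meant to expose.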
Spatial information takes different forms in different applications, ranging from accurate coordinates in geographic information systems to the qualitative abstractions that are used in artificial intelligence and spatial cognition. As a result, existing spatial information processing techniques tend to be tailored towards one type of spatial information, and cannot readily be extended to cope with the heterogeneity of spatial information that often arises in practice. In applications such as geographic information retrieval, on the other hand, approximate boundaries of spatial regions need to be constructed using whatever spatial information can be obtained. Motivated by this observation, we propose a novel methodology for generating spatial scenarios that are compatible with available knowledge. By suitably discretizing space, this task is translated to a combinatorial optimization problem, which is solved using a hybridization of two well-known meta-heuristics: genetic algorithms and ant colony optimization. What results is a flexible method that can cope with both quantitative and qualitative information, and can easily be adapted to the needs of specific applications. Experiments with geographic data demonstrate the potential of the approach.
Vernacular place names are names that are commonly in use to refer to geographical places. For purposes of effective information retrieval, the spatial extent associated with these names should reflect people's perception of the place, even though this may sometimes differ from the administrative definition of the same place name. Due to their informal nature, vernacular place names are hard to capture, but methods to acquire and define vernacular place names are of great benefit to search engines and all kinds of information services that deal with geographic data. This paper discusses the acquisition of vernacular use of place names from web sources and their representation as surface models derived by kernel density estimators. We show that various web sources containing user-created geographic information and business data can be used to represent neighbourhoods in Cardiff, UK. The resulting representations can differ in their spatial extent from administrative definitions. The chapter closes with an outlook on future research questions.
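To make the idea of a surface model derived by kernel density estimation concrete, here is a minimal sketch using an isotropic Gaussian kernel in plain Python. The kernel choice, the `bandwidth` value, and the point coordinates are assumptions for illustration; the abstract does not state which kernel or bandwidth the authors used.

```python
import math

def gaussian_kde_2d(points, x, y, bandwidth):
    """Density at (x, y) as the average of isotropic 2D Gaussian kernels."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    return norm * total

def density_surface(points, xs, ys, bandwidth):
    """Evaluate the density on a regular grid; rows follow ys, columns follow xs."""
    return [[gaussian_kde_2d(points, x, y, bandwidth) for x in xs] for y in ys]

# Made-up locations where a vernacular name was mentioned (arbitrary units).
mentions = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (5.0, 5.0)]
surface = density_surface(mentions, xs=[0.0, 2.5, 5.0], ys=[0.0, 2.5, 5.0], bandwidth=1.0)
```

Thresholding such a surface at a chosen density level gives one possible footprint for the vernacular place; the bandwidth controls how far each mention spreads its influence.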
People often communicate with reference to informally agreed places, such as 'the city centre'. However, views of the spatial extent of such areas may vary and result in imprecise regions. We compare perceptions of Sheffield's City Centre from a street survey (with 61 participants) to spatial extents derived from various web-based sources. Such automated approaches have advantages of speed, cost and repeatability. Our results show that footprints derived from web sources are often in concordance with models derived from more labour-intensive methods. There were, however, differences between some of the data sources, with those advertising/selling residential property diverging the most from the street survey data. Agreement between sources was measured by aggregating the web sources to identify locations of consensus.
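One simple way to aggregate several footprints and find locations of consensus is to count, for each grid cell, how many sources include it. The sketch below shows that idea only; the abstract does not describe the actual aggregation procedure, and the source names, the cell representation, and the `min_sources` threshold are all hypothetical.

```python
from collections import Counter

def consensus_cells(footprints, min_sources):
    """Return the grid cells covered by at least min_sources of the footprints."""
    counts = Counter(
        cell
        for cells in footprints.values()
        for cell in set(cells)  # count each source at most once per cell
    )
    return {cell for cell, n in counts.items() if n >= min_sources}

# Made-up footprints: each source maps to the grid cells it assigns to the place.
footprints = {
    "source_a": [(0, 0), (0, 1), (1, 1)],
    "source_b": [(0, 1), (1, 1), (2, 2)],
    "source_c": [(1, 1), (1, 1), (3, 3)],
}
core = consensus_cells(footprints, min_sources=3)
broad = consensus_cells(footprints, min_sources=2)
```

Raising `min_sources` shrinks the result towards the core that every source agrees on, while lowering it yields a broader, fuzzier extent.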