Abstract. A data type of increasing importance in GIS is what we call the location history: a record of an entity's location in geographical space over an interval of time. This paper proposes a number of rigorously defined data structures and algorithms for analyzing and generating location histories. Stays are instances where a subject has spent some time at a single location, and destinations are clusters of stays. Using stays and destinations, we then propose two methods for modeling location histories probabilistically. Experiments show the value of these data structures, as well as the possible applications of probabilistic models of location histories.
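The two definitions above (stays, and destinations as clusters of stays) can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes a hypothetical input format of time-ordered (seconds, x meters, y meters) fixes, a simple anchor-radius rule for stays, and a greedy centroid rule for destinations. The names `extract_stays` and `cluster_destinations` and the thresholds `max_radius`, `min_duration`, and `radius` are all illustrative assumptions.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

# Hypothetical fix format: (time in seconds, x in meters, y in meters).
Fix = Tuple[float, float, float]


@dataclass
class Stay:
    start: float  # time of the first fix in the stay
    end: float    # time of the last fix in the stay
    x: float      # mean x of the fixes in the stay
    y: float      # mean y of the fixes in the stay


def extract_stays(fixes: List[Fix], max_radius: float = 50.0,
                  min_duration: float = 300.0) -> List[Stay]:
    """Assumed rule: a stay is a run of consecutive fixes that all lie within
    max_radius of the run's first fix and that spans at least min_duration."""
    stays: List[Stay] = []
    i = 0
    while i < len(fixes):
        t0, x0, y0 = fixes[i]
        j = i + 1
        # Extend the run while fixes remain close to the anchor fix.
        while j < len(fixes) and hypot(fixes[j][1] - x0, fixes[j][2] - y0) <= max_radius:
            j += 1
        run = fixes[i:j]
        if run[-1][0] - t0 >= min_duration:
            cx = sum(p[1] for p in run) / len(run)
            cy = sum(p[2] for p in run) / len(run)
            stays.append(Stay(t0, run[-1][0], cx, cy))
            i = j       # continue after the stay
        else:
            i += 1      # too short: try the next fix as anchor
    return stays


def cluster_destinations(stays: List[Stay], radius: float = 100.0) -> List[List[Stay]]:
    """Assumed rule: greedily add each stay to the first cluster whose current
    centroid lies within radius, otherwise start a new cluster."""
    destinations: List[List[Stay]] = []
    for stay in stays:
        for members in destinations:
            cx = sum(s.x for s in members) / len(members)
            cy = sum(s.y for s in members) / len(members)
            if hypot(stay.x - cx, stay.y - cy) <= radius:
                members.append(stay)
                break
        else:
            destinations.append([stay])
    return destinations


# Synthetic example: two visits near the origin, one visit about 1.4 km away.
fixes = [(0, 0, 0), (400, 0, 0), (500, 1000, 1000), (900, 1000, 1000),
         (1000, 10, 0), (1400, 10, 0)]
stays = extract_stays(fixes)
destinations = cluster_destinations(stays)
```

With these synthetic fixes, the first and third stays fall into the same destination, which is the kind of repeated-visit structure a probabilistic model of location histories could then build on.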
Location information gathered from a variety of sources in the form of sensor data, video streams, human observations, and so on, is often imprecise and uncertain and needs to be represented approximately. To represent such uncertain location information, the use of a probabilistic model that captures the imprecise location as a probability density function (pdf) has recently been proposed. The pdfs can be arbitrarily complex depending on the type of application and the source of imprecision. Hence, efficiently representing, storing, and querying pdfs is a very challenging task. While the current state-of-the-art indexing approaches treat the representation and storage of pdfs as a black box, in this paper we take on the challenge of representing and storing any complex pdf in an efficient way. We further develop techniques to index such pdfs to support the efficient processing of location queries. Our extensive experiments demonstrate that our indexing techniques significantly outperform the best existing solutions.
Situational awareness (SA) applications monitor the real world and the entities therein to support tasks such as rapid decision-making, reasoning, and analysis. Raw input about unfolding events may arrive from a variety of sources in the form of sensor data, video streams, human observations, and so on, from which events of interest are extracted. Location is one of the most important attributes of events, useful for a variety of SA tasks. In this paper, we propose an approach to model and represent (potentially uncertain) event locations described by human reporters in the form of free text. We analyze several types of spatial queries of interest in SA applications. Our experimental evaluation demonstrates the effectiveness of our approach.
GIS data distributed in local, state, federal, and private data clearinghouses are being made accessible through the efforts of organizations such as the Federal Geographic Data Committee (FGDC) and GeoData.gov. Many database applications, such as disaster management, transportation, and national infrastructure protection, need to access GIS information from these various data sources. In this paper we study how to answer keyword-based spatial queries approximately using information from heterogeneous GIS sources. An example query specifies the region of Orange County and the keywords "junior schools," asking for geospatial objects relevant to junior schools in Orange County. The answers to such a query provided by different sources differ widely in their content and quality. It is computationally expensive to access all the datasets to retrieve all the relevant objects. We develop approximate algorithms for answering such queries based on the local analysis of the query region using space-partitioning techniques. Our methods rank datasets in a partition based on parameters such as their spatial coverage and content matching the query keywords. The quality of the answers improves progressively as we perform deeper local analysis. We develop an efficient traversal strategy to maximize the quality refinement within a given time limit. We conducted experiments to evaluate the proposed techniques.
Local search engines allow users to search for entities such as businesses in a particular geographic location. To improve the geographic relevance of search, user feedback data such as logged click locations are traditionally used. In this paper, we use anonymized mobile call log data as an alternate source of data and investigate its relevance to local search. Such data consist of records of anonymized mobile calls made to local businesses along with the locations of the cell towers that handled the calls. We model the probability of calls made to particular categories of businesses as a function of distance, using a generalized linear model framework. We provide a detailed comparison between a click log and a mobile call log, showing the relevance of the call log to local search. We describe our probabilistic models and apply them to anonymized mobile call logs for New York City and Los Angeles restaurants.
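As a rough illustration of modeling call probability as a function of distance with a generalized linear model, the sketch below fits a binomial GLM with a logit link by plain gradient ascent. The abstract does not state the link function, the covariates, or the fitting procedure, so the logit link, the single raw-distance covariate, the function names, and the counts are all assumptions. The data are synthetic and not taken from any call log.

```python
from math import exp
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(distances: List[float], successes: List[int], trials: List[int],
                 lr: float = 0.05, steps: int = 5000) -> Tuple[float, float]:
    """Fit p(call | d) = sigmoid(b0 + b1 * d) to grouped binomial data
    by gradient ascent on the log-likelihood (assumed model form)."""
    b0 = b1 = 0.0
    total = sum(trials)
    for _ in range(steps):
        g0 = g1 = 0.0
        for d, s, n in zip(distances, successes, trials):
            resid = s - n * sigmoid(b0 + b1 * d)  # observed minus expected calls
            g0 += resid
            g1 += resid * d
        # Normalize by the number of observations to keep the step size stable.
        b0 += lr * g0 / total
        b1 += lr * g1 / total
    return b0, b1


def predict(b0: float, b1: float, d: float) -> float:
    """Estimated probability of a call at distance d."""
    return sigmoid(b0 + b1 * d)


# Synthetic grouped data: at each distance (km), calls observed out of 100 opportunities.
distances = [1.0, 2.0, 4.0, 8.0]
trials = [100, 100, 100, 100]
successes = [60, 45, 25, 5]

b0, b1 = fit_logistic(distances, successes, trials)
```

On these made-up counts the fitted slope `b1` is negative, so the estimated call probability decreases with distance. A real analysis would more likely use an established statistics package and per-category models than this hand-rolled fit.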