Abstract. In this paper, we propose a new spatial clustering method, called DBRS+, which aims to cluster spatial data in the presence of both obstacles and facilitators. It can handle datasets with intersecting obstacles and facilitators. DBRS+ requires no preprocessing; it processes constraints during clustering. It can find clusters with arbitrary shapes and varying densities. DBRS+ has been empirically evaluated using synthetic and real data sets, and its performance has been compared to DBRS, AUTOCLUST+, and DBCLuC*.
Introduction
Dealing with constraints due to obstacles and facilitators is an important topic in constraint-based spatial clustering. An obstacle is a physical object that obstructs reachability among the data objects, and a facilitator is a physical object that connects distant data objects or connects data objects across obstacles. Handling these constraints can lead to effective and fruitful data mining by capturing application semantics [6, 9]. We illustrate some constraints with the following example.
Suppose a real estate company wants to identify optimal shopping mall locations for western Canada, shown in the map in Figure 1. In the map, a small oval represents a minimum number of residences. Each river, represented with a light polyline, acts as an obstacle that separates residences on its two sides. The dark lines represent the highways, which could shorten the traveling time. Since obstacles exist in the area and should not be ignored, the simple Euclidean distances among the objects are not appropriate for measuring user convenience when planning locations of shopping malls. Similarly, since traveling on highways is faster than in urban centers, the length of the highways should be shortened for this analysis. Ignoring the role of such obstacles (rivers) and facilitators (highways for driving) when performing clustering may lead to distorted or useless results.
In this paper, we extend the density-based clustering method DBRS [11] to handle obstacles and facilitators and call the extended method DBRS+. The contributions of DBRS+ are as follows. First, it can handle both obstacles, such as fences, rivers, and highways (when walking), and facilitators, such as bridges, tunnels, and highways (when driving), which exist in the data. Both obstacles and facilitators are modeled as polygons. Most previous research can only handle obstacles. Second, DBRS+ can handle any combination of intersecting obstacles and facilitators. None of the previous methods considers intersecting obstacles, which are common in real data. For example, highways or rivers often cross each other, and bridges and tunnels often cross rivers. Although the
We investigate the design and implementation of a parallel workflow environment targeted at the financial industry. The system performs real-time correlation analysis and clustering to identify trends within streaming high-frequency intra-day trading data. Our system uses state-of-the-art methods to optimize the delivery of computationally expensive real-time stock market data analysis, with direct applications in automated/algorithmic trading as well as knowledge discovery in high-throughput electronic exchanges. This paper describes the design of the system, including the key online parallel algorithms for robust correlation calculation and clique-based clustering using stochastic local search. We evaluate the performance and scalability of the system, followed by a preliminary analysis of the results using data from the Toronto Stock Exchange.
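The correlation-then-clique idea in the abstract above can be illustrated with a minimal sketch. It is not the system's algorithm: it assumes plain Pearson correlation in place of the robust estimator, and a simple greedy heuristic in place of stochastic local search. The ticker names, return values, threshold, and function names are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences (assumption:
    the paper uses a robust estimator instead)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / sqrt(vx * vy)


def correlation_graph(series, threshold):
    """Connect two symbols when their correlation is at least `threshold`."""
    graph = {name: set() for name in series}
    for a, b in combinations(series, 2):
        if pearson(series[a], series[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def greedy_clique(graph):
    """Greedy heuristic (a stand-in for stochastic local search): visit
    vertices by decreasing degree and keep each one that is adjacent to
    every vertex already kept."""
    order = sorted(graph, key=lambda v: (-len(graph[v]), v))
    clique = []
    for v in order:
        if all(v in graph[u] for u in clique):
            clique.append(v)
    return sorted(clique)


# Hypothetical intra-day returns for four made-up symbols.
series = {
    "AAA": [0.01, 0.02, -0.01, 0.03, -0.02],
    "BBB": [0.02, 0.04, -0.02, 0.06, -0.04],
    "CCC": [0.011, 0.018, -0.012, 0.031, -0.019],
    "DDD": [-0.01, -0.02, 0.01, -0.03, 0.02],
}
cluster = greedy_clique(correlation_graph(series, threshold=0.9))
print(cluster)
```

In this toy data, AAA, BBB, and CCC move together and form one clique, while DDD moves against them and is left out. A greedy pass returns a maximal clique but not necessarily a maximum one, which is one reason a search-based method may be preferred on real data.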
We describe the design of a lightweight library using MPI to support stream processing on acyclic process structures. The design can be used to connect arbitrary modules, where each module can be its own parallel MPI program. We make extensive use of MPI groups and communicators to increase the flexibility of the library and to make it easier and safer to use. MPI's notion of a communication context ensures that libraries do not conflict, so a message sent within one library cannot be mistakenly received by another. The library is not required to be part of any larger workflow environment and is compatible with existing MPI execution environments. The library is part of MarketMiner, a system for executing financial workflows.