2017
DOI: 10.1145/3083897

On the Hardness and Approximation of Euclidean DBSCAN

Abstract: DBSCAN is a method proposed in 1996 for clustering multi-dimensional points, and has received extensive applications. Its computational hardness is still unsolved to this date. The original KDD'96 paper claimed an algorithm of O(n log n) "average runtime complexity" (where n is the number of data points) without a rigorous proof. In 2013, a genuine O(n log …
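As a point of reference for readers unfamiliar with the method, here is a minimal sketch of running DBSCAN with scikit-learn's off-the-shelf implementation; the synthetic data, eps, and min_samples values are illustrative assumptions, not taken from the paper or its algorithms.

```python
# Minimal DBSCAN sketch with scikit-learn (illustrative parameters, not from the paper).
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
X = rng.random((1000, 2))          # n = 1000 points in 2 dimensions

# eps is the neighborhood radius, min_samples the MinPts density threshold.
labels = DBSCAN(eps=0.1, min_samples=10).fit_predict(X)

# Label -1 marks noise points; the rest are cluster ids.
print("clusters found:", len(set(labels)) - (1 if -1 in labels else 0))
```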

Cited by 44 publications (44 citation statements)
References 29 publications
“…One of the most widespread agglomeration methods, i.e. when two clusters are similar enough to be combined, is Ward's method, while the Euclidean distance is the most common, standard distance measure for continuous data (Gan & Tao, 2017). In conclusion, the agglomeration allowed not only distinguishing the categories of business models and their individual types but also tracing the evolution of business models in the analysed sector.…”
Section: Methods (mentioning)
confidence: 98%
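For illustration of the methodology described in that excerpt, here is a small sketch of agglomerative clustering with Ward's method and Euclidean distance using SciPy; the synthetic data and the choice of three clusters are assumptions, not details from the citing study.

```python
# Ward's-method agglomerative clustering sketch (assumed data and cluster count).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
X = rng.random((50, 4))            # 50 observations, 4 continuous features

Z = linkage(X, method="ward")      # Ward linkage; distances are Euclidean
labels = fcluster(Z, t=3, criterion="maxclust")  # cut the dendrogram into 3 clusters
print(labels)
```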
“…If a non-empty cell contains at least MinPts points, the cell is called a core cell; moreover, because the maximum distance between any two points in the cell is at most ε, all points in the cell are core points, so it is not necessary to compute the density of each point in the core cell. Based on the fast DBSCAN algorithm, Gan and Tao proposed the ρ-approximate DBSCAN algorithm [23], which achieves an excellent complexity of O(n) in low dimensions.…”
Section: Related Work (mentioning)
confidence: 99%
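A rough sketch of the grid step described in that excerpt, assuming cells of side ε/√d so that any two points in the same cell are within distance ε; this is only an illustration of the core-cell idea, not Gan and Tao's actual implementation.

```python
# Core-cell detection sketch: cells of side eps / sqrt(d) have diameter <= eps,
# so any cell holding >= MinPts points consists entirely of core points.
from collections import defaultdict
import numpy as np

def core_cells(X, eps, min_pts):
    d = X.shape[1]
    side = eps / np.sqrt(d)                      # guarantees cell diameter <= eps
    cells = defaultdict(list)
    for i, p in enumerate(X):
        cells[tuple(np.floor(p / side).astype(int))].append(i)
    # Keep only cells whose population already certifies their points as core points.
    return {c: idx for c, idx in cells.items() if len(idx) >= min_pts}

X = np.random.default_rng(2).random((500, 2))
print(len(core_cells(X, eps=0.2, min_pts=10)), "core cells")
```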
“…The runtime complexity of a single range query when using a sequential scan is O(N), resulting in a total runtime complexity of O(N² + Σᵢ rᵢ) in the worst case. In many practical applications, however, by using suitable index structures such as R*-trees, range queries can be evaluated much faster than by using a sequential scan (Gan & Tao, 2017; Schubert, Sander, Ester, Kriegel, & Xu, 2017). GDBSCAN is an algorithmic framework that generalizes the notion of density-based clusters to the concept of density-connected decomposition for any type of data.…”
Section: Classic Algorithms for Flat Density-Based Clustering (mentioning)
confidence: 99%
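To illustrate the contrast drawn above, here is a hedged sketch comparing a sequential scan with an index-based ε-range query; a k-d tree from SciPy stands in for the R*-trees mentioned in the cited works, and the data and radius are assumptions.

```python
# Range query via sequential scan vs. a spatial index (k-d tree as a stand-in for R*-trees).
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
X = rng.random((10_000, 2))
q, eps = X[0], 0.02

# Sequential scan: O(N) distance computations per query.
scan = np.where(np.linalg.norm(X - q, axis=1) <= eps)[0]

# Index-based query: typically much faster in low dimensions.
tree = cKDTree(X)
indexed = tree.query_ball_point(q, r=eps)

assert set(scan) == set(indexed)
print(len(indexed), "neighbors within eps")
```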