2016
DOI: 10.1016/j.procs.2016.05.389

Hierarchical Density-Based Clustering Based on GPU Accelerated Data Indexing Strategy

Cited by 10 publications (9 citation statements)
References 12 publications
“…A parallel computing environment has become the first choice for solving big data processing problems. Some researchers have proposed parallel algorithms based on multithreading [18]. Although the pressure of storage and calculation has been relieved to a great extent, the limitation of memory resources has become the algorithm of the bottleneck of expansion; the FP-Growth algorithm is based on the Apriori principle.…”
Section: Related Theories
Citation type: mentioning
confidence: 99%
“…These low-volume clusters often contain valuable information, which might not even be known to medical experts: their low volume makes it difficult to detect them via manual inspection. To perform clustering in the embedding space, we use the hierarchical, density-based clustering algorithm HDBSCAN (Campello et al, 2013; Melo et al, 2016; McInnes et al, 2017). As customary in unsupervised learning tasks, one needs to provide some information on the desired granularity, i.e.…”
Section: Clustering Similar Medical Inquiries Via Hierarchical Clustering
Citation type: mentioning
confidence: 99%
“…Here, we propose an approach - schematically depicted in Figure 1 - to discover topics from short, unstructured, real-world medical inquiries. Our methodology consists of the following steps: medical inquiries are preprocessed (via lemmatization, stopword removal) and converted to vectors via a biomedical word embedding (scispacy (Neumann et al, 2019)), a dimensionality reduction is then applied to lower the dimensionality of the embedded vectors (via UMAP (McInnes et al, 2018a; McInnes et al, 2018b)), clustering is performed in this lower dimensional space to group together similar inquiries (via HDBSCAN (Campello et al, 2013; Melo et al, 2016; McInnes et al, 2017)). These clusters of similar inquiries are then merged based on semantic similarity: we define these (merged) clusters as topics.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…One of the outputs that this text mining process produces is a semantic tree which can be explored interactively on the PoliRural platform (see PoliRural Innovation Hub section). We will use ANNOY and HDBSCAN for clustering (Melo et al, 2016) and novel Word Mover's Distance for sentence and paragraph similarity analysis (Ye et al, 2016).…”
Section: Text Mining Enabled Policy Evaluation
Citation type: mentioning
confidence: 99%