2019
DOI: 10.2478/ijasitels-2019-0007

DBSCAN Algorithm for Document Clustering

Abstract: Document clustering is the problem of automatically grouping similar documents into categories based on some similarity metric. Almost all available data, usually on the web, is unclassified, so we need powerful clustering algorithms that work with this type of data. All common search engines return a list of pages relevant to the user query. This list needs to be generated quickly and as accurately as possible. For this type of problem, because the web pages are unclassified, we need powerful clustering algorithms…
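
To make the idea of density-based document grouping concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract as excerpted here does not show the paper's feature representation or distance measure, so the bag-of-words vectors, the cosine distance, the toy documents, and the helper names (`tf_vector`, `cosine_distance`, `dbscan`) are all assumptions made for illustration. Only `eps` and `min_pts` correspond to the two standard DBSCAN parameters.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term counts for one document (illustrative assumption)."""
    return Counter(text.lower().split())

def cosine_distance(a, b):
    """1 - cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 - dot / norm if norm else 1.0

def dbscan(vectors, eps, min_pts):
    """Return one label per document: 0, 1, ... for clusters, -1 for noise."""
    n = len(vectors)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i (the point itself included).
        return [j for j in range(n)
                if cosine_distance(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical toy corpus, invented for this example.
DOCS = [
    "cats purr and cats sleep",
    "cats sleep and purr",
    "kittens and cats purr",
    "stocks rise as markets rally",
    "markets rally and stocks rise",
    "stocks fall as markets rally",
    "quantum entanglement experiment",
]
VECTORS = [tf_vector(d) for d in DOCS]

print(dbscan(VECTORS, eps=0.5, min_pts=2))  # [0, 0, 0, 1, 1, 1, -1]
print(dbscan(VECTORS, eps=0.1, min_pts=2))  # [0, 0, -1, -1, -1, -1, -1]
```

The number of clusters is never passed in; it falls out of `eps` and `min_pts`. The second call shows how much the result depends on those two values: with a tighter `eps`, most of the toy documents are labeled as noise.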

Cited by 5 publications (2 citation statements)
References 1 publication

“…This approach overcomes limitations in traditional document clustering methods by leveraging density-based clustering to identify clusters of varying shapes and sizes. Liu and Yang (2022) [30] Limitations of DBSCAN: While DBSCAN presents a promising approach in the context of text clustering and document classification, it is not exempt from certain limitations that deserve consideration. Despite its capacity to uncover clusters of varying shapes and sizes based on density connectivity, this method's performance can be sensitive to parameter settings.…”
Section: DBSCAN
mentioning
confidence: 99%
“…Additionally, the topic modeling feature enables continued updates of topic models as new ideas emerge during brainstorming sessions. The use of algorithms such as Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [20,21], which is well known for its ability to cluster data without needing the number of clusters as input, improves clustering within topic modeling with BERTopic, further enhancing the exploration and comprehension of brainstorming concepts. Furthermore, BERTopic provides a range of visualization tools, including topic hierarchies and inter-topic distance maps, which facilitate clearer insights and decision-making processes.…”
mentioning
confidence: 99%