Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023
DOI: 10.1145/3580305.3599314

DotHash: Estimating Set Similarity Metrics for Link Prediction and Document Deduplication

Citations: cited by 2 publications (5 citation statements)
References: 40 publications
“…In this work, we present HyperGen: a genome sketching tool based on hyperdimensional computing (HDC) (Nunes et al., 2023; Kanerva, 2009) that improves accuracy, runtime performance, and memory efficiency for large-scale genomic analysis. HyperGen inherits the advantages of both FracMinHash-based sketching (Hera et al., 2023) and DotHash (Nunes et al., 2023). HyperGen first samples the k-mer set using FracMinHash.…”
Section: Discussion (mentioning)
Confidence: 99%
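
The sampling step mentioned in this statement can be illustrated with a minimal sketch of FracMinHash-style sampling: hash every k-mer and keep only those whose hash falls below a fixed fraction of the hash range. This is not HyperGen's code. The function names, the choice of BLAKE2b as the hash, and the `scale` parameter are assumptions made for illustration, and the sketch skips details that real tools handle, such as canonical k-mers.

```python
import hashlib

MAX_HASH = 2**64  # size of the 64-bit hash range used below (assumed)


def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer: str) -> int:
    """Map a k-mer to a 64-bit integer (hash choice is illustrative)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq: str, k: int, scale: int) -> set:
    """Keep the k-mers whose hash lies in the lowest 1/scale of the range."""
    threshold = MAX_HASH // scale
    return {km for km in kmers(seq, k) if hash64(km) < threshold}


if __name__ == "__main__":
    seq = "ACGTACGTGACCTGAACGTTAGC"
    sample = frac_minhash(seq, k=4, scale=4)
    print(len(kmers(seq, 4)), "k-mers,", len(sample), "kept")
```

Because the threshold depends only on the hash value, the same k-mer is either kept or dropped in every sequence, which is what lets sampled sets from different genomes be compared directly.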
“…A recent work (Nunes et al., 2023) demonstrates that the speed and memory efficiency of Jaccard similarity approximation can be improved by using the DotHash based on Random Indexing (Sahlgren, 2005). The key step to compute Jaccard similarity in Eq.…”
Section: Methods (mentioning)
Confidence: 99%
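
As a rough illustration of the idea this statement refers to, the sketch below estimates Jaccard similarity in a Random Indexing style: each element gets a pseudo-random ±1 vector, a set is summarized by the sum of its element vectors, and the scaled dot product of two summaries approximates the intersection size. It is a minimal sketch under stated assumptions, not the authors' implementation. The dimensionality `DIM`, the SHA-256 seeding, and all function names are hypothetical, and the equation truncated in the quotation is not reproduced here.

```python
import hashlib
import random

DIM = 4096  # sketch dimensionality (assumed; larger values reduce variance)


def element_vector(item: str, dim: int = DIM) -> list:
    """Derive a deterministic pseudo-random +/-1 vector from the item."""
    seed = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(dim)]


def sketch(items: set, dim: int = DIM) -> list:
    """Summarize a set as the element-wise sum of its element vectors."""
    total = [0] * dim
    for item in items:
        for i, v in enumerate(element_vector(item, dim)):
            total[i] += v
    return total


def estimate_jaccard(a: set, b: set, dim: int = DIM) -> float:
    """Estimate |A n B| from the sketch dot product, then form Jaccard."""
    sa, sb = sketch(a, dim), sketch(b, dim)
    inter = sum(x * y for x, y in zip(sa, sb)) / dim
    union = len(a) + len(b) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    a = {f"x{i}" for i in range(60)}
    b = {f"x{i}" for i in range(30, 90)}
    print("estimated:", estimate_jaccard(a, b), "exact:", 30 / 90)
```

Shared elements contribute `dim` to the dot product, while unrelated elements contribute noise that averages to zero, so the estimate tightens as `dim` grows.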