2018 IEEE International Conference on Big Data (Big Data)
DOI: 10.1109/bigdata.2018.8622338
Efficient Dimensionality Reduction for Sparse Binary Data

Cited by 8 publications (12 citation statements)
References 7 publications
“…Each gene represents a data point, and for every gene the dataset stores the integer-valued read-count of that gene corresponding to each cell; these read-counts form our features. Baseline algorithms: the alternative approaches that we compare against are listed in Table 2, namely Binary Compression Scheme (BCS) [34]*, Hamming LSH (H-LSH) [12]*, Feature Hashing (FH) [41], signed random projection / SimHash (SH) [9], Kendall rank correlation coefficient (KT) [19], Latent Semantic Analysis (LSA) [11], Latent Dirichlet Allocation (LDA) [6], Multiple Correspondence Analysis (MCA) [5], Non-negative Matrix Factorisation (NNMF) [24], Variational auto-encoder (VAE) [21], and vanilla Principal Component Analysis (PCA). (* BCS and H-LSH are applied on a BinEm embedding.)…”
Section: Methods (mentioning)
confidence: 99%
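Of the baselines listed above, signed random projection (SimHash) has a particularly compact description: each binary vector is projected onto a few random Gaussian directions, only the signs are kept, and the fraction of agreeing sign bits estimates the angular similarity of the original vectors. The snippet below is a minimal NumPy illustration written for this summary, not code from the cited papers; the function names, the signature length k, and the toy data are assumptions.

```python
import numpy as np

def simhash_sketch(X, k, seed=0):
    """Signed random projection (SimHash): map each row of the binary matrix
    X (n points x d features) to a k-bit signature given by the signs of k
    random Gaussian projections."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], k))  # k random projection directions
    return (X @ R) >= 0                       # boolean (n x k) signature matrix

def estimated_similarity(sig_a, sig_b):
    """Fraction of agreeing bits; approximates 1 - theta/pi for the angle
    theta between the original vectors."""
    return float(np.mean(sig_a == sig_b))

# Toy usage on three sparse binary rows (hypothetical data).
X = np.array([[1, 0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [0, 1, 0, 0, 1, 1]], dtype=float)
S = simhash_sketch(X, k=64)
print(estimated_similarity(S[0], S[1]))  # high: rows 0 and 1 overlap heavily
print(estimated_similarity(S[0], S[2]))  # low: rows 0 and 2 share no features
```

Methods such as BCS and H-LSH in the same table instead produce binary sketches whose pairwise Hamming distances track those of the original vectors, which is the property discussed in the next citation statement.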
“…BinSketch can be applied to this binary sketch to further compress it into low-dimensional binary vectors; the original pairwise Hamming distances can then be approximated from those vectors. Note that there are other known compression algorithms for binary vectors, such as BCS [34]. However, we prefer to use BinSketch as it offers better theoretical as well as practical guarantees on the quality of its estimation.…”
Section: Related Work (mentioning)
confidence: 99%
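The compression step described in this statement can be sketched with a simplified bucket-and-OR construction: each original coordinate is hashed to one of k output positions, the bits landing in the same bucket are OR-ed, and the Hamming distance between sketches is used as a rough proxy for the original distance. This is only a hedged illustration of the general idea; the estimators in the cited works handle bucket collisions more carefully than this direct comparison, and the function names and parameters below are assumptions.

```python
import numpy as np

def bucket_or_sketch(x, k, seed=0):
    """Compress a d-dimensional binary vector into k bits: hash every
    coordinate to one of k buckets (the same hash for all vectors via a
    fixed seed) and OR together the bits that land in the same bucket.
    Simplified illustration only, not the exact BinSketch estimator."""
    rng = np.random.default_rng(seed)
    buckets = rng.integers(0, k, size=x.shape[0])
    sketch = np.zeros(k, dtype=bool)
    np.logical_or.at(sketch, buckets, x.astype(bool))  # OR bits into buckets
    return sketch

def hamming(a, b):
    """Exact Hamming distance between two equal-length binary vectors."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))

# Toy usage: two sparse 1000-dimensional binary vectors that differ in
# exactly 10 coordinates (hypothetical data).
rng = np.random.default_rng(1)
x = (rng.random(1000) < 0.02).astype(int)
y = x.copy()
y[rng.choice(1000, size=10, replace=False)] ^= 1
sx, sy = bucket_or_sketch(x, k=128), bucket_or_sketch(y, k=128)
print(hamming(x, y))    # 10
print(hamming(sx, sy))  # close to 10 when the vectors are sparse and k is large
```

The sketch distance underestimates the true distance whenever differing coordinates collide in the same bucket, which is why the quality of this rough proxy depends on the sparsity of the input and the number of buckets k.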