2022
DOI: 10.1089/cmb.2022.0275

Differentiable Learning of Sequence-Specific Minimizer Schemes with DeepMinimizer

Abstract: Minimizers are widely used to sample representative k-mers from biological sequences in many applications, such as read mapping and taxonomy prediction. In most scenarios, having the minimizer scheme select as few k-mer positions as possible (i.e., having a low density) is desirable to reduce computation and memory cost. Despite the growing interest in minimizers, learning an effective scheme with optimal density is still an open question, as it requires solving …

Cited by 6 publications (19 citation statements)
References 22 publications
“…The density factor normalizes density for the window size w of the scheme. We follow the definition of Zheng et al. [8]: for a sequence S the density factor is df(S) = […] factor removes the dependence on L, e.g., making the expected density factor of all random minimizers the same, regardless of k and L. Note that other works define the density factor simply as the density times a factor of (w + 1) (c.f.…”
Section: Minimizer Density (mentioning)
confidence: 99%
“…The sampling function of a minimizer scheme is characterized by a tuple of parameters (w, k, π), where w and k are defined above. Additionally, π is a total ordering over the set of all k-mers, which can be represented [7] as a scoring function, such that for every pair of k-mers κ, κ′ ∈ Σ^k: …”
Section: Background and Notation (mentioning)
confidence: 99%
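
The (w, k, π) parameterization quoted above can be illustrated with a minimal Python sketch. It is not taken from the cited papers. It assumes that a window spans w consecutive k-mers, that ties are broken toward the leftmost k-mer, that density is the fraction of k-mer positions selected, and that the density factor is the simpler "density times (w + 1)" variant mentioned in the quoted statement. The function names and the lexicographic scoring function standing in for π are hypothetical choices made for illustration.

```python
from typing import Callable, List

ALPHABET = "ACGT"


def lexicographic_score(kmer: str) -> int:
    """Hypothetical scoring function for the ordering pi: base-4 rank of the k-mer."""
    value = 0
    for base in kmer:
        value = value * len(ALPHABET) + ALPHABET.index(base)
    return value


def minimizer_positions(
    seq: str, w: int, k: int, score: Callable[[str], float]
) -> List[int]:
    """Start positions of the k-mers selected by a (w, k, score) minimizer.

    Assumption: each window covers w consecutive k-mers and the lowest-scoring
    k-mer is selected, with ties broken toward the leftmost position.
    """
    n_kmers = len(seq) - k + 1
    if n_kmers < w:
        return []
    scores = [score(seq[i:i + k]) for i in range(n_kmers)]
    selected = set()
    for start in range(n_kmers - w + 1):
        window = range(start, start + w)
        # Comparing (score, position) pairs makes the leftmost k-mer win ties.
        selected.add(min(window, key=lambda i: (scores[i], i)))
    return sorted(selected)


def density(seq: str, w: int, k: int, score: Callable[[str], float]) -> float:
    """Fraction of k-mer positions selected (one common convention, assumed here)."""
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        return 0.0
    return len(minimizer_positions(seq, w, k, score)) / n_kmers


def density_factor(seq: str, w: int, k: int, score: Callable[[str], float]) -> float:
    """Density times (w + 1), the simpler definition mentioned in the quoted text."""
    return density(seq, w, k, score) * (w + 1)
```

For example, `minimizer_positions("ACGTACGT", 2, 2, lexicographic_score)` selects five of the seven 2-mer positions. Swapping in a different scoring function changes the ordering π, and therefore which positions are selected, without touching w or k.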
“…Varying the mask configuration induces a spectrum of comparable schemes, including minimizers (i.e., all-ones mask) and various syncmer schemes (i.e., one-hot masks). This unification reveals a methodical approach to derive comparable sketching schemes via combining existing minimizer construction/optimization techniques [7, 5, 21, 22] with a mask optimization routine.…”
Section: Introduction (mentioning)
confidence: 99%