2016 IEEE 16th International Conference on Data Mining (ICDM) 2016
DOI: 10.1109/icdm.2016.0092

DESQ: Frequent Sequence Mining with Subsequence Constraints

Abstract: Frequent sequence mining methods often make use of constraints to control which subsequences should be mined; e.g., length, gap, span, regular-expression, and hierarchy constraints. We show that many subsequence constraints (including and beyond those considered in the literature) can be unified in a single framework. In more detail, we propose a set of simple and intuitive "pattern expressions" to describe subsequence constraints and explore algorithms for efficiently mining frequent subsequences under…

Citations: cited by 5 publications (19 citation statements)
References: 29 publications
“…* (B). * ] * d, it will consider $2^{\Theta(n)}$ many partial runs that capture the $b_i$, but none of them lead to acceptance, since $T$ does not end with d. We will prove that the two-pass approach successfully avoids such exponential computations. More precisely, it can enumerate the elements of $G_A(T)$ in linear delay, i.e., it can compute a first element in $G_A(T)$ in time $O(|T|)$ and, from there on, we can always compute a new element of $G_A(T)$ in time $O(|T|)$ or conclude that no such element exists.…”
Section: Worst-case Runtime Of Two-pass Approach
Citation type: mentioning
confidence: 95%
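
The idea behind a two-pass simulation can be illustrated with a small sketch. This is not the cited paper's algorithm and makes no claim about its linear-delay guarantee; it only shows the general principle, assuming a hypothetical toy transducer encoded as a dictionary from states to `(label, emit, next_state)` transitions, where `label=None` matches any item. A backward pass marks, for each position, the states from which acceptance is still reachable; the forward pass then follows only transitions that stay inside those states, so it never explores a run that is doomed to fail.

```python
# Hypothetical toy transducer: skip any item, optionally capture each "b",
# and accept only if the final item is "d".
TOY = {
    0: [(None, False, 0),   # skip any item, no output
        ("b", True, 0),     # capture a "b"
        ("d", False, 1)],   # read the final "d", move to accepting state 1
}


def live_states(transitions, finals, seq):
    """Backward pass: live[pos] holds the states that can still reach acceptance on seq[pos:]."""
    n = len(seq)
    live = [set() for _ in range(n + 1)]
    live[n] = set(finals)
    for pos in range(n - 1, -1, -1):
        for state, outgoing in transitions.items():
            for label, _emit, nxt in outgoing:
                if (label is None or label == seq[pos]) and nxt in live[pos + 1]:
                    live[pos].add(state)
                    break
    return live


def enumerate_outputs(transitions, finals, seq, start=0):
    """Forward pass: yield the output of every accepting run, pruning dead ends."""
    live = live_states(transitions, finals, seq)
    buffer = []

    def step(state, pos):
        if pos == len(seq):
            # Only live states are ever entered, so this state is accepting.
            yield tuple(buffer)
            return
        for label, emit, nxt in transitions.get(state, []):
            if label is not None and label != seq[pos]:
                continue
            if nxt not in live[pos + 1]:
                continue  # this transition cannot lead to acceptance
            if emit:
                buffer.append(seq[pos])
            yield from step(nxt, pos + 1)
            if emit:
                buffer.pop()

    if start in live[0]:
        yield from step(start, 0)
```

On an input such as `list("abbb")`, which does not end with "d", the backward pass leaves `live[0]` empty and the forward pass returns immediately instead of trying every combination of captured "b" items.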
“…If there are multiple such transitions, then we select them one by one (via backtracking). As we move from state to state, we append items that are encoded by the output labels (column "Produces" in Table 2) of the selected transitions to an output buffer (S, lines 10-19). As before, if a transition encodes more than one output item, we append them one by … [ALGORITHM 1: Naive sFST simulation]…”
Section: Computational Model
Citation type: mentioning
confidence: 99%
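
The backtracking simulation described in this statement can be sketched roughly as follows. This is a simplified illustration, not the cited Algorithm 1: the transducer encoding (a dictionary from states to `(label, emit, next_state)` tuples, with `label=None` matching any item) and the function name `simulate` are assumptions made for the example, and each transition here emits at most the matched item rather than a set of generalized items.

```python
def simulate(transitions, finals, seq, start=0):
    """Enumerate the outputs of all accepting runs by trying transitions one by one."""
    results = set()
    buffer = []  # output buffer, extended and truncated as we backtrack

    def step(state, pos):
        if pos == len(seq):
            if state in finals:
                results.add(tuple(buffer))
            return
        item = seq[pos]
        for label, emit, nxt in transitions.get(state, []):
            if label is not None and label != item:
                continue  # transition does not match the current item
            if emit:
                buffer.append(item)
            step(nxt, pos + 1)
            if emit:
                buffer.pop()  # undo the output before trying the next transition

    step(start, 0)
    return results


# Hypothetical transducer: skip any item, optionally capture "b", end on "d".
TOY_FST = {
    0: [(None, False, 0), ("b", True, 0), ("d", False, 1)],
}
```

Because every matching transition is explored, the number of runs can grow exponentially with the number of items that may be either captured or skipped, which is the behavior a naive simulation is expected to show.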
“…One approach to improve flexibility is the use of subsequence constraints, which specify conditions under which a subsequence is potentially interesting to the particular application. Ordered by increasing flexibility, common types of subsequence constraints include length constraints [28], [34], gap and duration constraints [14], [28], [34], hierarchy constraints [28], "output filter" regular expression constraints [2], [3], [13], [31], and regular expression constraints with capture groups and hierarchies [5], [7]. The latter type subsumes the remaining ones, and we subsequently refer to it as flexible constraints.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
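
To make the simpler constraint types in this statement concrete, the following sketch checks a length constraint and a gap constraint for a single candidate subsequence. The function names and the convention that the gap counts the items skipped between two consecutive matched items are assumptions chosen for the illustration, not definitions taken from the cited works.

```python
def occurs_with_max_gap(seq, pattern, max_gap):
    """Return True if pattern is a subsequence of seq with at most max_gap
    skipped items between any two consecutive matched items."""
    if not pattern:
        return True
    # Positions where the first pattern item can be matched.
    ends = {i for i, item in enumerate(seq) if item == pattern[0]}
    for wanted in pattern[1:]:
        # Extend each partial match to positions within the allowed gap.
        ends = {
            j
            for i in ends
            for j in range(i + 1, min(i + max_gap + 2, len(seq)))
            if seq[j] == wanted
        }
        if not ends:
            return False
    return bool(ends)


def satisfies(seq, pattern, max_gap, max_length):
    """Combine a length constraint on the pattern with the gap constraint."""
    return len(pattern) <= max_length and occurs_with_max_gap(seq, pattern, max_gap)
```

For example, `occurs_with_max_gap(list("abcd"), list("ad"), 2)` holds because only two items separate "a" and "d", whereas the same call with `max_gap=1` does not.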