2020
DOI: 10.1007/s00778-020-00619-4

Scalable data series subsequence matching with ULISSE

Abstract: Data series similarity search is an important operation and at the core of several analysis tasks and applications related to data series collections. Despite the fact that data series indexes enable fast similarity search, all existing indexes can only answer queries of a single length (fixed at index construction time), which is a severe limitation. In this work, we propose ULISSE, the first data series index structure designed for answering similarity search queries of variable length (within some range). O…

Cited by 20 publications (7 citation statements)
References 60 publications
“…This is true for other iSAX-based indices. For example, we could parallelize in a way similar to MESSI the ULISSE index [53], which supports queries of variable length, as well as the DPiSAX index [85], which is a distributed index operating on top of Spark (but currently not supporting parallel execution within each node of the Spark cluster). It is an interesting open problem to study whether there exist efficient parallelization techniques for indexing schemes whose tree index does not satisfy this large fanout property that would result in better performance than MESSI.…”
Section: Discussion (mentioning)
confidence: 99%
“…Even though much effort has been dedicated to developing techniques for data series analytics, existing solutions for subsequence matching, motif and discord discovery are limited to fixed length queries/results. In this Ph.D. work, we propose the first scalable solutions to the variable-length version of these problems: ULISSE is the first index that supports variable-length subsequence matching over both Z-normalized and non Z-normalized sequences [15,13,14], while MAD is the first framework that implements variable-length motif and discord discovery [17,4,16].…”
Section: Discussion (mentioning)
confidence: 99%
“…1. ULISSE (ULtra compact Index for variable-length Similarity SEarch in data series) is the first indexing technique that supports variable-length subsequence matching for non Z-normalized and Z-normalized data series [15,13,14].…”
Section: Introduction (mentioning)
confidence: 99%
“…Optimized and Approximate Similarity Search. The database community has optimized similarity search methods by using index structures [22,27,28,33,49,72,73,82,83,132,134,139,146] or fast sequential scans [112]. Recently, Echihabi et al. [47,48] compared the efficiency of these methods under a unified experimental framework, showing that there is no single best method that outperforms all the rest.…”
Section: Contributions (mentioning)
confidence: 99%
“…In this context, progressive answers help to speed up exact queries by stopping execution early, when it is highly probable that the current progressive answer is the exact one. Note that several data series similarity search methods support approximate query answering that can produce increasingly more accurate answers as time goes by [23,51,73,83,134,146], though none of them provides quality guarantees on the answers. In this work, we focus on the iSAX2+ [23] and DSTree [134] methods, which exhibit superior performance at the similarity search task [47,48].…”
Section: Contributions (mentioning)
confidence: 99%