2004
DOI: 10.1613/jair.1491

On Prediction Using Variable Order Markov Models

Abstract: This paper is concerned with algorithms for prediction of discrete sequences over a finite alphabet, using variable order Markov models. The class of such algorithms is large and in principle includes any lossless compression algorithm. We focus on six prominent prediction algorithms, including Context Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic Suffix Trees (PSTs). We discuss the properties of these algorithms and compare their performance using real life sequences from three dom…

Cited by 301 publications (287 citation statements)
References 64 publications
“…The crucial essence in compression is estimating the conditional probability for the next outcome given the past observations, so that symbols (and sub-sequences) with high conditional probabilities are assigned short codes that, for long sequences, attain the entropy lower bound (Begleiter et al., 2004). Thus, constructing a data-compression model that minimizes the average log-loss score of a sequence is equivalent to constructing a prediction model that maximizes the likelihood of a sequence.…”
Section: Introduction (mentioning, confidence: 99%)
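The equivalence stated in this citation can be made concrete with a short numerical check. This is a minimal sketch (not taken from the paper): given the per-symbol probabilities some hypothetical model assigned to an observed sequence, the average log-loss in bits per symbol is exactly the negative base-2 log-likelihood divided by the sequence length, so minimizing one maximizes the other.

```python
import math

def avg_log_loss(probs):
    """Average log-loss (bits/symbol) of a model that assigned
    probability probs[i] to the i-th observed symbol."""
    return -sum(math.log2(p) for p in probs) / len(probs)

def log_likelihood(probs):
    """Log-likelihood (base 2) of the same sequence under the same model."""
    return sum(math.log2(p) for p in probs)

# Hypothetical per-symbol probabilities for a 4-symbol sequence.
probs = [0.5, 0.25, 0.125, 0.5]

# Identity: avg log-loss = -(1/n) * log-likelihood.
n = len(probs)
assert abs(avg_log_loss(probs) + log_likelihood(probs) / n) < 1e-12
```

In coding terms, `avg_log_loss` is the ideal codelength per symbol, which is why a better compressor is, by the same arithmetic, a better (higher-likelihood) predictor.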
“…Therefore, in this study the terms "compression" and "prediction" are often considered equivalent. Note that although other universal-compression algorithms can be used for prediction, the VOM model used (a variation of the context tree of Rissanen, 1983) has been shown to attain the best asymptotic convergence rate for a given sequence (Ziv, 2001; Begleiter et al., 2004).…”
Section: Introduction (mentioning, confidence: 99%)