This technical report aims to fill a gap in the current literature on Timescale Graphical Event Models. I propose and evaluate different heuristics for determining hyper-parameters during structure learning and refine an existing distance measure. A comprehensive benchmark on synthetic data is conducted to draw conclusions about the applicability of the different heuristics.
Graphical Event Models
This chapter introduces the class of Graphical Event Models and, in particular, Timescale Graphical Event Models (Gunawardana and Meek 2016). After a reminder of the general framework and its notation, I will recap the structure learning algorithm and discuss different heuristics to choose hyper-parameters. Further, I briefly explain how to generate synthetic data. Finally, I propose a refined distance measure to evaluate how similar two Timescale Graphical Event Models are. For the sake of consistency, definitions and notation are adopted from the original work of Gunawardana and Meek (2016).

Event streams and their temporal dynamics can be represented as a multivariate temporal point process, and the literature offers several advanced methods such as Continuous Time Bayesian Networks (Nodelman et al. 2002), Poisson Networks (Rajaram, Graepel, and Herbrich 2005), Conjoint Piecewise-Constant Conditional Intensity Models (Parikh, Gunawardana, and Meek 2012), or Multiplicative-Forest Point Processes (Weiss and Page 2013). These methods share the concept of conditional intensity functions, which express the rate at which a specific event occurs, conditioned on previous event occurrences.

Graphical Event Models (GEMs) (Didelez 2008; Meek 2014; Gunawardana and Meek 2016) provide a framework that generalizes the aforementioned models. Moreover, Gunawardana and Meek (2016) showed that GEMs can universally approximate any smooth multivariate temporal point process. GEMs provide a compact graphical representation of such a process: different events are represented as nodes, and an edge from node A to node B implies that the appearance of event A has some influence on the occurrence of event B. In addition to this qualitative information about temporal dependencies, GEMs also contain quantitative information about these dynamics in terms of conditional intensity functions.
Preliminaries
Gunawardana and Meek (2016) denote an event as a pair $(t, l) \in \mathbb{R}^{+} \times L$, which has a timestamp $t > 0$ and a label $l$ taken from a finite label vocabulary $L$. A stream of events is then a sequence $\{(t_1, l_1), \ldots, (t_i, l_i), \ldots, (t_n, l_n)\}$, where $t_0 = 0 < t_i < t_{i+1} < t^{*}$ for $1 \le i \le n - 1$. Further, let $x_{t^{*}}$ denote the sequence of events $\{(t_i, l_i) : t_i < t^{*}\}$ up to time $t^{*}$, and let $h_i$ denote the $i$th history, $h_i = (t_1, l_1), \ldots, (t_{i-1}, l_{i-1})$.
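To make this notation concrete, the following minimal Python sketch shows one possible way to represent an event stream and to derive the sequence of events before a time t* and the ith history from it. The names `Event`, `validate_stream`, `events_until`, and `history` are hypothetical and chosen for illustration only; they are not taken from Gunawardana and Meek (2016) or from any existing library.

```python
from typing import List, Tuple

# Hypothetical representation: an event is a (timestamp, label) pair.
Event = Tuple[float, str]


def validate_stream(events: List[Event], t_star: float) -> bool:
    """Check that 0 < t_1 < t_2 < ... < t_n < t_star holds."""
    prev = 0.0  # t_0 = 0
    for t, _label in events:
        if not (prev < t < t_star):
            return False
        prev = t
    return True


def events_until(events: List[Event], t_star: float) -> List[Event]:
    """Return all events (t_i, l_i) with t_i < t_star."""
    return [(t, label) for t, label in events if t < t_star]


def history(events: List[Event], i: int) -> List[Event]:
    """Return the ith history: the events preceding the ith event.

    The index i is 1-based, so history(events, 1) is empty.
    """
    if i < 1:
        raise ValueError("i must be at least 1")
    return events[: i - 1]


# Illustrative stream over the label vocabulary {"A", "B"}.
stream = [(0.5, "A"), (1.2, "B"), (2.0, "A")]
```

With this stream, `events_until(stream, 1.5)` keeps the first two events, and `history(stream, 3)` returns the same two events because they precede the third one. Strictly increasing timestamps are enforced by `validate_stream`, which mirrors the ordering condition stated above.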