The main advantages of Tarjan's strongly connected component (SCC) algorithm are its linear time complexity and ability to return SCCs on-the-fly, while traversing or even generating the graph. Until now, most parallel SCC algorithms sacrifice both: they run in quadratic worst-case time and/or require the full graph in advance.
Conformance checking is a branch of process mining that aims to assess to what degree event data originating from the execution of a (business) process and a corresponding reference model conform to each other. Alignments have been recently introduced as a solution for conformance checking and have since rapidly developed into becoming the de facto standard. The state-of-the-art method to compute alignments is based on solving a shortest path problem derived from the reference model and the event data. Within such a shortest path problem, a cost function is used to guide the search to an optimal solution. The standard cost-function treats mismatches in the model and log as equal. In this paper, we consider a variant of this standard cost function which maximizes the number of correct matches instead. We study the effects of using this cost-function compared to the standard cost function on both small and large models using over a thousand generated and industrial case studies. We further show that the alignment computation process can be sped up significantly in specific instances. Finally, we present a new algorithm for the computation of alignments on models with many log traces that is an order of magnitude faster (in maximizing synchronous moves) compared to the state-of-the-art A* based solution method, as a result of a preprocessing step on the model.
Conformance checking is a branch of process mining that aims to assess to what degree a given set of log traces and a corresponding reference model conform to each other. The state-of-the-art approach in conformance checking is based on the concept of alignments. Alignments express the observed behaviour in terms of the reference model while minimizing the number of mismatches between the event data and the model. The currently known best algorithm for constructing alignments applies the A* shortest path algorithm for each trace of event data. In this work, we apply insights from the field of model checking to aid conformance checking. We investigate whether alignments can be computed efficiently via symbolic reachability with decision diagrams. We designed a symbolic algorithm for computing shortest-paths on graphs restricted to 0-and 1-cost edges (which is typical for alignments). We have implemented our approach in the LTSMIN model checking toolset and compare its performance with the A* implementation supported by ProM. We generated more than 4000 experiments (Petri net model and log trace combinations) by setting various parameters, and analysed performance and related these to structural properties. Our empirical study shows that the symbolic technique is in general better suited for computing alignments on large models than the A* approach. Our approach is better performing in cases where the size of the state-space tends to blow up. Based on our experiments we conclude that the techniques are complementary, since there is a significant number of cases where A* outperforms the symbolic technique and vice versa.
The main advantages of Tarjan's strongly connected component (SCC) algorithm are its linear time complexity and ability to return SCCs on-the-fly, while traversing or even generating the graph. Until now, most parallel SCC algorithms sacrifice both: they run in quadratic worst-case time and/or require the full graph in advance. The current paper presents a novel parallel, on-the-fly SCC algorithm. It preserves the linear-time property by letting workers explore the graph randomly while carefully communicating partially completed SCCs. We prove that this strategy is correct. For efficiently communicating partial SCCs, we develop a concurrent, iterable disjoint set structure (combining the union-find data structure with a cyclic list). We demonstrate scalability on a 64-core machine using 75 real-world graphs (from model checking and explicit data graphs), synthetic graphs (combinations of trees, cycles and linear graphs), and random graphs. Previous work did not show speedups for graphs containing a large SCC. We observe that our parallel algorithm is typically 10-30× faster compared to Tarjan's algorithm for graphs containing a large SCC. Comparable performance (with respect to the current state-of-the-art) is obtained for graphs containing many small SCCs.
In the automata theoretic approach to explicit state LTL model checking, the synchronized product of the model and an automaton that represents the negated formula is checked for emptiness. In practice, a (transition-based generalized) Büchi automaton (TGBA) is used for this procedure. This paper investigates whether using a more general form of acceptance, namely a transition-based generalized Rabin automaton (TGRA), improves the model checking procedure. TGRAs can have significantly fewer states than TGBAs; however, the corresponding emptiness checking procedure is more involved. With recent advances in probabilistic model checking and LTL to TGRA translators, it is only natural to ask whether checking a TGRA directly is more advantageous in practice. We designed a multi-core TGRA checking algorithm and performed experiments on a subset of the models and formulas from the 2015 Model Checking Contest and generated LTL formulas for models from the BEEM database. While we found little to no improvement by checking TGRAs directly, we show how various aspects of a TGRA's structure influences the model checking performance. In this paper, we also introduce a Finless acceptance condition, which is a disjunction of TGBAs. We show how to convert TGRAs into automata with Fin-less acceptance and show how a TGBA emptiness procedure can be extended to check Fin-less automata.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.