Causality has been a subject of philosophical debate since Hippocrates. It is used in formal verification and testing, e.g., to explain counterexamples or to construct fault trees. Recent work defines actual causation in terms of Pearl's causality framework, but most definitions brought forward so far struggle with examples where one event preempts another. A key to capturing such examples in the context of programs or distributed systems is a sound treatment of control flow. We discuss how causal models should incorporate control flow and find that much of what Pearl and Halpern's notion of contingencies tries to capture is captured better by explicitly modelling the control flow in terms of structural equations, together with an arguably simpler definition. Inspired by causality notions in the security domain, we bring forward a definition of causality that takes these control variables into account. This definition provides a clear picture of the interaction between control flow and causality and captures the notoriously difficult preemption examples without secondary concepts. We report convincing results on a benchmark of 34 examples from the literature.
Notation

We write $\vec{t}$ for a sequence $t_1, \ldots, t_n$ if $n$ is clear from the context, and use $(a_1, \ldots, a_n) \cdot (b_1, \ldots, b_m) = (a_1, \ldots, a_n, b_1, \ldots, b_m)$ to denote concatenation. We filter a sequence $l$ by a set $S$, denoted $l|_S$, by removing each element that is not in $S$.
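The two sequence operations can be illustrated with a minimal Python sketch. The function names `concat` and `filter_seq` are hypothetical and chosen only for this illustration; sequences are represented as tuples.

```python
from typing import Iterable, Set, Tuple, TypeVar

T = TypeVar("T")


def concat(a: Iterable[T], b: Iterable[T]) -> Tuple[T, ...]:
    """Concatenation (a_1, ..., a_n) . (b_1, ..., b_m)."""
    return tuple(a) + tuple(b)


def filter_seq(l: Iterable[T], s: Set[T]) -> Tuple[T, ...]:
    """Filtering l|_S: drop every element of l that is not in S.

    Order and repetitions of the remaining elements are preserved.
    """
    return tuple(x for x in l if x in s)
```

For instance, filtering the sequence `("a", "b", "a", "c")` by the set `{"a", "c"}` keeps both occurrences of `"a"` and drops `"b"`.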
Causality framework (Review)

We review the causality framework introduced by Pearl [23], also known as the structural equations model. The causality framework models how random variables influence each other. The set of random variables, which we assume to be discrete, is partitioned into a set $\mathcal{U}$ of exogenous variables and a set $\mathcal{V}$ of endogenous variables. Exogenous variables lie outside the model, e.g., in the case of a security protocol, the scheduling and the attack the adversary decides to mount. Endogenous variables are ultimately determined by the values of the exogenous variables. A signature is a triple consisting of $\mathcal{U}$, $\mathcal{V}$ and a function $R$ associating a range, i.e., a set, with each variable $Y \in \mathcal{U} \cup \mathcal{V}$. A causal model on this signature defines the relation between endogenous variables and exogenous variables or other endogenous variables in terms of a set of equations.

Definition 1 (Causal model). A causal model $M$ over a signature $S = (\mathcal{U}, \mathcal{V}, R)$ is a pair of said signature $S$ and a set of functions $F = \{F_X\}_{X \in \mathcal{V}}$ such that, for each $X \in \mathcal{V}$, […]

Each causal model induces a causal network, a graph with a node for each variable in $\mathcal{V}$, and an edge from $X$ to $Y$ iff $F_Y$ depends on $X$. ($Y$ depends on $X$ iff there is a setting for the variables in $\mathcal{V} \cup \mathcal{U} \setminus \{X, Y\}$ […])

1. $\mathcal{V}$ can be partitioned into $\mathcal{V}_{\mathrm{rch}}$ and a set of data variables $\mathcal{V}_{\mathrm{dat}}$,
2. $R(V_{\mathrm{rch}}) = \{\top, \bot\}$ for any $V_{\mathrm{rch}} \in \mathcal{V}_{\mathrm{rch}}$, and, […]

Table 1: Set of causes for various examples from the literature. The suffix (ctl) marks models adapted to Definition 5.
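To make the causal model of Definition 1 and its induced causal network concrete, the following is a minimal Python sketch under stated assumptions. The variable names (`U_L`, `U_M`, `L`, `M`, `FIRE`) and the equations are a hypothetical toy model, not one of the benchmark examples. The sketch assumes finite ranges and an acyclic model, and it reads "depends" in the usual sense: some setting of the remaining variables exists under which changing $X$ changes the output of $F_Y$.

```python
from itertools import product

# Signature (U, V, R): exogenous names, endogenous names, finite ranges.
# All names below are hypothetical and serve only as an illustration.
U = ["U_L", "U_M"]
V = ["L", "M", "FIRE"]
R = {name: (False, True) for name in U + V}

# One structural equation F_X per endogenous variable X. Each equation
# reads a dict holding the values of the other variables.
F = {
    "L": lambda val: val["U_L"],
    "M": lambda val: val["U_M"],
    "FIRE": lambda val: val["L"] or val["M"],
}


def depends_on(y, x):
    """True iff some setting of the other variables makes F_y vary with x."""
    others = [n for n in U + V if n not in (x, y)]
    for setting in product(*(R[n] for n in others)):
        base = dict(zip(others, setting))
        outputs = {F[y]({**base, x: vx}) for vx in R[x]}
        if len(outputs) > 1:
            return True
    return False


def causal_network():
    """Edges (x, y) between endogenous variables with F_y depending on x."""
    return {(x, y) for y in V for x in V if x != y and depends_on(y, x)}


def solve(context):
    """Evaluate all variables for an exogenous context (assumes acyclicity)."""
    val = dict(context)
    edges = causal_network()
    pending = set(V)
    while pending:
        # A variable is ready once all its endogenous parents have values.
        ready = [y for y in pending
                 if all(x in val for (x, target) in edges if target == y)]
        if not ready:
            raise ValueError("model is cyclic")
        for y in ready:
            val[y] = F[y](val)
            pending.remove(y)
    return val
```

In this toy model, `causal_network()` yields the two edges from `L` and `M` to `FIRE`, and `solve({"U_L": True, "U_M": False})` sets `FIRE` to `True`. The brute-force dependence check enumerates all settings and is therefore only practical for small models.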