We develop a theoretical framework for defining and identifying flows of information in computational systems. Here, a computational system is assumed to be a directed graph, with "clocked" nodes that send transmissions to each other along the edges of the graph at discrete points in time. A few measures of information flow have been proposed previously in the literature, and measures of directed causal influence are currently being used as a heuristic proxy for information flow. However, there is as yet no rigorous treatment of the problem with formal definitions and clearly stated assumptions, and the problem of defining information flow is often conflated with that of estimating it. In this work, we provide a new information-theoretic definition of information flow in a computational system, which we motivate using a series of examples. We then show that this definition satisfies intuitively desirable properties, including the existence of "information paths", along which information flows from the input of the computational system to its output. Finally, we describe how information flow might be estimated in a noiseless setting, and provide an algorithm that identifies information paths on the time-unrolled graph of a computational system.

1. Causal in the "Signals and Systems" sense of the word, where a node cannot make use of future transmissions [41].

2. Although the work of Ahlswede et al. (2000) is titled "Network Information Flow", it actually addresses a different problem: that of the achievable rate region of a broadcast network and the optimal coding strategy that achieves this rate. In contrast to their work, which concentrates on characterizing and achieving the optimal rate, our focus is on understanding how information […]

Figure 1: An example of how a complete directed graph is unrolled to create a time-unrolled graph. On the left, we show a complete directed graph G* with three nodes, V* = {A, B, C}. These nodes are fully connected to each other via edges E*, including self-edges. On the right, we show how G* has been unrolled using time indices T = {0, 1, 2} to obtain a time-unrolled graph G. The set of all nodes at time t = 0 is V_0, and the set of all (outgoing) edges at time t = 0 is denoted E_0. As an example, we show an arbitrary edge E_0 ∈ E_0 (here, E_0 = (C_0, B_1)) and the transmission on that edge, X(E_0). As another example, we show a "self-edge" in the time-unrolled graph, E_1 ∈ E_1, which in this case is E_1 = (A_1, A_2). Also depicted is the transmission X(E_1) on this self-edge, which is interpreted as the contents of the memory of node A from t = 1 to t = 2. The message M arrives at the input node A_0, but could in general be available at more than one node at t = 0.

[…] communicate to each other over time. We define a random variable model for the nodes' transmissions, and demonstrate how each node computes them. We also explain what we mean by a "message", and formally define the input nodes of the computational system.

Definition 1 (Comple...
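The unrolling construction described in Figure 1 can be sketched in a few lines of code. The following is a minimal illustration, not the paper's own implementation: it builds the node and edge sets of the time-unrolled graph G from a complete directed graph G* (with self-edges) and a set of time indices, representing each time-indexed node as a `(name, t)` pair. The function name `unroll` is our own choice for this sketch.

```python
from itertools import product

def unroll(nodes, times):
    """Build the time-unrolled graph G of a complete directed graph G*
    (with self-edges) on `nodes`, over the discrete time indices `times`.

    Nodes of G are (v, t) pairs; every edge goes from time t to time t+1,
    so a node can never make use of future transmissions (causality).
    A "self-edge" such as (('A', 1), ('A', 2)) models the memory of
    node A carried from t = 1 to t = 2.
    """
    V = [(v, t) for t in times for v in nodes]
    E = [((u, t), (v, t + 1))
         for t in times[:-1]                      # no edges leave the last time step
         for u, v in product(nodes, repeat=2)]    # complete, incl. self-edges
    return V, E

# The example of Figure 1: V* = {A, B, C}, T = {0, 1, 2}.
V, E = unroll(["A", "B", "C"], [0, 1, 2])
assert (("C", 0), ("B", 1)) in E   # the edge E_0 from the figure
assert (("A", 1), ("A", 2)) in E   # the self-edge E_1 (node A's memory)
print(len(V), len(E))              # 9 nodes, 18 edges (3x3 edges per step, 2 steps)
```

A transmission model can then be layered on top by associating a random variable X(E) with each edge E in `E`, as the surrounding text describes.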