We consider the following communication scenario. An encoder causally observes the Wiener process and decides when and what to transmit about it. A decoder estimates the process using causally received codewords in real time. We determine the causal encoding and decoding policies that jointly minimize the mean-square estimation error, under the long-term communication rate constraint of $R$ bits per second. We show that an optimal encoding policy can be implemented as a causal sampling policy followed by a causal compressing policy. We prove that the optimal encoding policy samples the Wiener process once the innovation passes either $\sqrt{1/R}$ or $-\sqrt{1/R}$, and compresses the sign of the innovation (SOI) using a 1-bit codeword. The SOI coding scheme achieves the operational distortion-rate function, which is equal to $D^{\mathrm{op}}(R) = \frac{1}{6R}$. Surprisingly, this is significantly better than the distortion-rate tradeoff achieved in the limit of infinite delay by the best non-causal code. This is because the SOI coding scheme leverages the free timing information supplied by the zero-delay channel between the encoder and the decoder. The key to unlocking that gain is the event-triggered nature of the SOI sampling policy. In contrast, the distortion-rate tradeoffs achieved with deterministic sampling policies are much worse: we prove that the causal informational distortion-rate function in that scenario is as high as $D_{\mathrm{DET}}(R) = \frac{5}{6R}$. It is achieved by the uniform sampling policy with the sampling interval $\frac{1}{R}$. In either case, the optimal strategy is to sample the process as fast as possible and to transmit 1-bit codewords to the decoder without delay. We show that the SOI coding scheme also minimizes the mean-square cost of a continuous-time control system driven by the Wiener process and controlled via rate-constrained impulses.
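The SOI scheme described above can be checked numerically with a short Monte Carlo sketch. This is an illustration under stated assumptions, not the paper's implementation: the function name `simulate_soi` and the parameters `horizon`, `dt`, and `seed` are hypothetical, and the Wiener process is approximated by Gaussian increments on a time grid of width `dt`. Because of that discretization, threshold crossings are detected slightly late, so the empirical rate and distortion should land near, but not exactly on, $R$ and $\frac{1}{6R}$.

```python
import math
import random


def simulate_soi(rate, horizon=1000.0, dt=1e-3, seed=0):
    """Monte Carlo sketch of sign-of-innovation (SOI) coding.

    Returns (empirical rate in bits per second, empirical mean-square error).
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    threshold = math.sqrt(1.0 / rate)  # sampling threshold sqrt(1/R)
    step_std = math.sqrt(dt)           # std of a Wiener increment over dt
    n_steps = int(horizon / dt)

    error = 0.0                   # process value minus decoder estimate
    bits_sent = 0
    squared_error_integral = 0.0  # Riemann sum of error^2 over time

    for _ in range(n_steps):
        error += rng.gauss(0.0, step_std)
        if abs(error) >= threshold:
            # The encoder sends one bit (the sign of the innovation); the
            # decoder shifts its estimate by +/- threshold, which leaves
            # only the small discretization overshoot as residual error.
            error -= math.copysign(threshold, error)
            bits_sent += 1
        squared_error_integral += error * error * dt

    return bits_sent / horizon, squared_error_integral / horizon
```

Calling `simulate_soi(1.0)` returns a pair that can be compared against the target rate $R = 1$ and the predicted distortion $\frac{1}{6R} \approx 0.167$; shrinking `dt` or lengthening `horizon` should tighten the agreement at the cost of run time.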