“…In this paper, we use a re-implementation of the CPC model [34], which we call CPC2. The encoder architecture is the same (5 convolutional layers with kernel sizes [10,8,4,4,4], strides [5,4,2,2,2], and hidden dimension 256). For the context network, we used a 2-layer LSTM, and for the prediction network, we used a multi-head transformer [35], with each of the 12 heads predicting one future time slice.…”
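
The kernel sizes and strides quoted above fix how much the encoder downsamples its input. The following is a minimal sketch of that arithmetic, not the authors' code: the function names are hypothetical, and the frame-count helper assumes unpadded, undilated convolutions, which the passage does not state.

```python
# Encoder hyperparameters quoted in the passage above.
KERNEL_SIZES = [10, 8, 4, 4, 4]
STRIDES = [5, 4, 2, 2, 2]


def total_stride(strides):
    """Product of the per-layer strides: input samples per output frame."""
    hop = 1
    for s in strides:
        hop *= s
    return hop


def receptive_field(kernel_sizes, strides):
    """Number of input samples that influence a single output frame."""
    field, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        field += (k - 1) * jump  # each layer widens the field by (k - 1) steps
        jump *= s                # a step at this depth spans `jump` input samples
    return field


def output_length(num_samples, kernel_sizes, strides):
    """Frames produced for num_samples inputs.

    ASSUMPTION: no padding and no dilation (not specified in the source).
    """
    n = num_samples
    for k, s in zip(kernel_sizes, strides):
        if n < k:
            return 0
        n = (n - k) // s + 1
    return n


hop = total_stride(STRIDES)
field = receptive_field(KERNEL_SIZES, STRIDES)
print(hop, field)  # 160 465
```

If the input were 16 kHz audio (an assumption; the passage gives no sample rate), a hop of 160 samples would correspond to one encoder frame every 10 ms, and each of the 12 prediction heads would then target one such frame.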