“…Due to its independence from the underlying architecture and its low complexity, weight sharing quantization has been widely applied. Here, the weights are first partitioned into multiple categories; then, within each category, a representative value is selected and used to replace all weights in that category. Such methods differ mainly in how they partition the network weights, e.g., by means of clustering techniques [29], statistical methods [30,31], uniform schemes [32], or by minimizing the distortion and the entropy of the coded source [33]. We will describe these methods in detail in Section 3.…”
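
The partition-then-replace idea described in the excerpt can be illustrated with a minimal sketch. It assumes the simplest case, a uniform scheme with equal-width bins, and takes the mean of each bin as its representative value. The function name `share_weights_uniform` and its parameters are hypothetical, chosen for illustration only; this is not the implementation of any of the cited methods.

```python
from statistics import mean


def share_weights_uniform(weights, num_bins):
    """Illustrative weight sharing: equal-width bins, bin mean as representative.

    Returns the quantized weights and the sorted list of shared values.
    Assumption: a flat list of floats stands in for the network weights.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights are identical, so a single shared value suffices.
        return list(weights), [lo]

    width = (hi - lo) / num_bins
    # Step 1: partition. Map each weight to a bin index in [0, num_bins - 1];
    # the maximum weight is clamped into the last bin.
    assignments = [min(int((w - lo) / width), num_bins - 1) for w in weights]

    # Step 2: pick one representative per non-empty bin (here, the mean).
    representatives = {}
    for b in set(assignments):
        members = [w for w, a in zip(weights, assignments) if a == b]
        representatives[b] = mean(members)

    # Step 3: replace every weight with the representative of its bin.
    quantized = [representatives[a] for a in assignments]
    return quantized, sorted(representatives.values())


if __name__ == "__main__":
    quantized, shared_values = share_weights_uniform([0.0, 0.1, 0.9, 1.0], 2)
    print(quantized)
    print(shared_values)
```

After quantization, at most `num_bins` distinct values remain, so each weight can be stored as a short index into the table of shared values. The other families mentioned in the excerpt differ in the first two steps, for example by replacing the equal-width binning with a clustering step.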