Versatile Video Coding (VVC), the upcoming video coding standard, is expected by the end of 2020. VVC will offer better coding efficiency than the current High Efficiency Video Coding (HEVC) standard, a gain brought by several new coding tools. Multiple Transform Selection (MTS) is one of the key coding tools introduced in VVC. The MTS concept relies on three transform types: Discrete Cosine Transform (DCT)-II, Discrete Sine Transform (DST)-VII and DCT-VIII. Unlike the DCT-II, which has fast computing algorithms, the DST-VII and DCT-VIII rely on more complex matrix multiplication. In this paper, an approximation approach is proposed to reduce the computational cost of the DST-VII and DCT-VIII. The approximation consists in applying adjustment stages, based on sparse block-band matrices, to variants of the DCT-II family, mainly the DCT-II and its inverse. A genetic algorithm is used to derive the optimal coefficients of the adjustment matrices. Moreover, an efficient hardware implementation of the forward and inverse approximate transform module is proposed. The architecture design includes a pipelined and reconfigurable forward-inverse DCT-II core transform, as it is the main core for the DST-VII and DCT-VIII computations. The proposed 32-point 1D architecture, including low-cost adjustment stages, can process 2K and 4K video at 1095 and 273 frames per second, respectively. A unified 2D implementation of the forward-inverse DCT-II, approximate DST-VII and DCT-VIII is also presented. The synthesis results show that the design can sustain 2K and 4K video at 386 and 96 frames per second, respectively, while using only 12% of the Adaptive Logic Modules (ALMs), 22% of the registers and 30% of the DSP blocks of the Arria10 SoC platform.
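To make the three MTS transform types concrete, the following minimal sketch builds floating-point, orthonormal versions of the DCT-II, DST-VII and DCT-VIII bases and applies one by direct matrix-vector multiplication, which costs n*n multiplications for an n-point input. It is an illustration under assumptions, not the paper's implementation: VVC specifies integer-scaled kernels, and the function names here are hypothetical. The sketch also checks a known identity that the abstract does not state, namely that the DCT-VIII basis equals the DST-VII basis with its sample order reversed and the sign of every odd row flipped.

```python
import math

def dct2(n):
    """Orthonormal DCT-II basis; row k holds the k-th frequency."""
    return [[math.sqrt((1.0 if k else 0.5) * 2.0 / n)
             * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
             for j in range(n)] for k in range(n)]

def dst7(n):
    """Orthonormal DST-VII basis (floating point, not the VVC integer kernel)."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]

def dct8(n):
    """Orthonormal DCT-VIII basis (floating point, not the VVC integer kernel)."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.cos(math.pi * (2 * k + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for k in range(n)]

def apply_transform(t, x):
    """Direct matrix-vector product: n*n multiplications for an n-point input."""
    return [sum(t_kj * x_j for t_kj, x_j in zip(row, x)) for row in t]

def max_orthogonality_error(t):
    """Largest absolute entry of T * T^T - I."""
    n = len(t)
    return max(abs(sum(t[a][j] * t[b][j] for j in range(n)) - (a == b))
               for a in range(n) for b in range(n))

def max_flip_sign_error(n):
    """Largest gap between DCT-VIII[k][j] and (-1)^k * DST-VII[k][n-1-j]."""
    s, c8 = dst7(n), dct8(n)
    return max(abs(c8[k][j] - (-1) ** k * s[k][n - 1 - j])
               for k in range(n) for j in range(n))

if __name__ == "__main__":
    n = 8
    for name, t in (("DCT-II", dct2(n)), ("DST-VII", dst7(n)), ("DCT-VIII", dct8(n))):
        print(name, "orthogonality error:", max_orthogonality_error(t))
    print("flip/sign relation error:", max_flip_sign_error(n))
```

Because of the flip and sign relation, any datapath that computes the DST-VII can in principle also produce the DCT-VIII with reordering and sign changes only; whether the proposed architecture exploits it in this way is not stated here.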
Versatile Video Coding (VVC) is the next-generation video coding standard, expected by the end of 2020. VVC introduces several new coding tools that enable better coding performance than the High Efficiency Video Coding (HEVC) standard. The Multiple Transform Selection (MTS) concept, as introduced in VVC, relies on three trigonometric transforms and, at the encoder side, selects the pair of horizontal and vertical transforms that minimises the rate-distortion cost. However, the new Discrete Sine Transform (DST)-VII and Discrete Cosine Transform (DCT)-VIII do not have fast computing algorithms and rely on matrix multiplication, which requires substantial hardware resources, especially for large block sizes. This paper tackles the hardware implementation of an approximation of the MTS module. This approximation consists in applying adjustment stages, based on sparse block-band matrices, to variants of the DCT-II family, mainly the DCT-II and its inverse. To this end, an efficient 2D hardware implementation of the forward and inverse approximate transform module is proposed. The architecture design includes a pipelined and reconfigurable forward-inverse DCT-II core transform. A unified 2D implementation of the 16-point and 32-point forward-inverse DCT-II, approximate DST-VII and DCT-VIII is also presented. The synthesis results show that the design can sustain 2K and 4K video at 377 and 94 frames per second, respectively, while using only 18% of the Adaptive Logic Modules (ALMs), 40% of the registers and 34% of the Digital Signal Processing (DSP) blocks of the Arria10 SoC platform.
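The encoder-side selection described above can be sketched as a separable 2D transform, Y = T_ver * X * T_hor^T, evaluated for each candidate pair, keeping the pair with the lowest cost. Everything below is a toy under stated assumptions: the candidate list (DCT-II in both directions plus the four DST-VII/DCT-VIII combinations), the uniform quantiser, the cost J = D + lambda * R with R approximated by the count of non-zero levels, and all function names are illustrative choices, not the VVC reference encoder or the proposed hardware.

```python
import math

def dct2(n):
    return [[math.sqrt((1.0 if k else 0.5) * 2.0 / n)
             * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
             for j in range(n)] for k in range(n)]

def dst7(n):
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]

def dct8(n):
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.cos(math.pi * (2 * k + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_2d(block, t_ver, t_hor):
    """Separable 2D transform: columns through t_ver, rows through t_hor."""
    return matmul(matmul(t_ver, block), transpose(t_hor))

def toy_rd_cost(coeffs, step=8.0, lam=20.0):
    """Toy cost (assumption): squared quantisation error plus lam per non-zero level."""
    cost = 0.0
    for row in coeffs:
        for c in row:
            level = round(c / step)
            cost += (c - level * step) ** 2 + lam * (level != 0)
    return cost

def select_pair(block):
    """Return the (horizontal, vertical) pair with the lowest toy cost, and all costs."""
    n = len(block)  # square blocks only in this sketch
    kernels = {"DCT-II": dct2(n), "DST-VII": dst7(n), "DCT-VIII": dct8(n)}
    candidates = [("DCT-II", "DCT-II")] + [
        (h, v) for h in ("DST-VII", "DCT-VIII") for v in ("DST-VII", "DCT-VIII")]
    costs = {(h, v): toy_rd_cost(forward_2d(block, kernels[v], kernels[h]))
             for h, v in candidates}
    return min(costs, key=costs.get), costs

if __name__ == "__main__":
    # Hypothetical residual block whose magnitude grows away from the top-left corner.
    residual = [[float((i + 1) * (j + 1)) for j in range(8)] for i in range(8)]
    best, costs = select_pair(residual)
    for pair, cost in costs.items():
        print(pair, round(cost, 2))
    print("selected:", best)
```

A real encoder measures distortion after reconstruction and rate from the entropy coder, so the pair chosen by this toy cost need not match what a VVC encoder would choose.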
Versatile Video Coding (VVC) is the next-generation video coding standard, expected by the end of 2020. VVC introduces the new concept of Multiple Transform Selection (MTS). MTS enables the VVC encoder to select the transform that minimizes the rate-distortion cost among a set of pre-defined trigonometric transforms, including the well-known Discrete Cosine Transform (DCT)-II, the DCT-VIII and the Discrete Sine Transform (DST)-VII. Unlike the DCT-II, which has fast computing algorithms, the DST-VII and DCT-VIII rely on more complex matrix multiplication. This paper tackles the problem of approximating the DST-VII and DCT-VIII using the DCT-II followed by an adjustment stage. The latter consists of a multiplication by a band matrix with a low number of non-zero coefficients per row. The approximation problem is first modeled as a constrained integer optimization problem that minimizes both the approximation error and the deviation from orthogonality. A genetic algorithm is then used to solve the optimization problem and find the adjustment band matrix that gives the best trade-off between error and orthogonality. The proposed solution preserves the coding gain achieved by the MTS and considerably reduces the complexity in terms of the number of multiplications required per coefficient. Moreover, the proposed approach is hardware-friendly and will provide a lightweight shared hardware module for the DCT-II, DST-VII and DCT-VIII transforms.
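The idea of a DCT-II followed by a band-matrix adjustment can be illustrated with a minimal sketch. Since the orthonormal DCT-II matrix C satisfies C^-1 = C^T, the dense matrix A = S * C^T reproduces the DST-VII matrix S exactly as A * C. The sketch then simply zeroes every entry of A outside a band around the main diagonal and reports the two quantities the optimization trades off: approximation error and deviation from orthogonality. This plain truncation is a stand-in chosen for illustration; it does not reproduce the paper's DCT-II variant, its integer constraint or its genetic search, and all function names are hypothetical.

```python
import math

def dct2(n):
    return [[math.sqrt((1.0 if k else 0.5) * 2.0 / n)
             * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
             for j in range(n)] for k in range(n)]

def dst7(n):
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius(m):
    return math.sqrt(sum(v * v for row in m for v in row))

def exact_adjustment(n):
    """Dense A with A * C == S, using C^-1 == C^T for the orthonormal DCT-II."""
    return matmul(dst7(n), transpose(dct2(n)))

def keep_band(a, half_width):
    """Zero every entry farther than half_width from the main diagonal."""
    return [[v if abs(i - j) <= half_width else 0.0 for j, v in enumerate(row)]
            for i, row in enumerate(a)]

def approximation_error(n, half_width):
    """Frobenius norm of (band(A) * C - S)."""
    band = keep_band(exact_adjustment(n), half_width)
    approx = matmul(band, dct2(n))
    diff = [[p - q for p, q in zip(ra, rs)] for ra, rs in zip(approx, dst7(n))]
    return frobenius(diff)

def orthogonality_error(a):
    """Frobenius norm of (A * A^T - I)."""
    gram = matmul(a, transpose(a))
    return frobenius([[v - (i == j) for j, v in enumerate(row)]
                      for i, row in enumerate(gram)])

if __name__ == "__main__":
    n = 8
    a = exact_adjustment(n)
    for w in range(n):
        print("half-width", w,
              "error", approximation_error(n, w),
              "orthogonality", orthogonality_error(keep_band(a, w)))
```

With half-width w, each output coefficient needs at most 2w + 1 adjustment multiplications on top of the DCT-II, which is where the complexity saving over a full n-by-n matrix product would come from. How well a narrow band performs depends on where the energy of A concentrates, which this sketch does not attempt to optimise.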