Design and Analysis of Approximate Hardware Accelerators for VVC Intra Angular Prediction
Abstract
The Versatile Video Coding (VVC) standard significantly improves compression efficiency over its predecessor, HEVC, but at the cost of substantially higher computational complexity, particularly in intra-frame prediction. This stage employs various directional modes, each requiring multiple multiplications between reference samples and constant coefficients. In hardware accelerators, multiplierless constant multiplication (MCM) blocks are a promising way to optimize these operations. However, VVC's interpolation filters have more than fifty distinct coefficients, which makes MCM implementations resource-intensive. This work proposes an approximation method that reduces the number of interpolation coefficients by averaging fixed subsets of them, thereby decreasing MCM block size and potentially lowering circuit area and power consumption. It introduces six MCM block architectures for angular intra prediction, five of which use the proposed approximation method, and evaluates the trade-off between coefficient reduction and coding efficiency against a conventional multiplier architecture. Experimental results on ten videos show that only two of the MCM implementations exceed a BD-Rate increase of 4% in the worst case and 2.6% on average, while two of the MCM implementations reduce circuit area by 20% and 44%, respectively. For three of the architectures, parallel sample prediction modules were synthesized, showing a 30% reduction in gate area compared with single-sample processing units and, for two of the implementations, a reduction in energy consumption.
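As a rough illustration of the idea summarized above, the following Python sketch averages fixed subsets of a coefficient list and multiplies by a constant using only shifts and additions, which is the kind of operation an MCM block implements. It is a minimal sketch under stated assumptions, not the architecture evaluated in this work: the coefficient values, the group size of two, and the function names `average_subsets` and `shift_add_multiply` are all hypothetical and chosen only for illustration.

```python
def average_subsets(coeffs, group_size):
    """Replace each consecutive group of coefficients by its rounded mean.

    The grouping (consecutive, fixed size) is an assumption made for this
    sketch; the actual subsets used in the paper are not given here.
    """
    approx = []
    for start in range(0, len(coeffs), group_size):
        group = coeffs[start:start + group_size]
        # Integer mean, rounding halves upward.
        mean = (sum(group) + len(group) // 2) // len(group)
        approx.extend([mean] * len(group))
    return approx


def shift_add_multiply(x, c):
    """Multiply x by a non-negative integer constant c with shifts and adds only."""
    acc = 0
    shift = 0
    while c:
        if c & 1:
            acc += x << shift  # add the partial product for this set bit
        c >>= 1
        shift += 1
    return acc


# Hypothetical coefficient values, NOT the VVC interpolation filter tables.
coeffs = [1, 2, 3, 5, 6, 8, 10, 11]
approx = average_subsets(coeffs, group_size=2)

print("distinct coefficients before:", len(set(coeffs)))
print("distinct coefficients after: ", len(set(approx)))

# Every remaining constant can be applied to a sample without a multiplier.
sample = 37
products = {c: shift_add_multiply(sample, c) for c in sorted(set(approx))}
print(products)
```

Fewer distinct constants mean fewer shift-and-add networks to instantiate, which is the intuition behind the smaller MCM blocks; the price is that each approximated coefficient deviates slightly from its original value, which is where the coding-efficiency loss comes from.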