This paper presents the design and implementation of a dedicated hardware architecture for the binary arithmetic decoder (BAD) engines of CABAD, as defined in the H.264/AVC video compression standard. The BAD is the most important process of CABAD, the main entropy decoding method defined by the H.264/AVC standard. The BAD is composed of three engines: Regular, Bypass, and Terminate. An extensive set of software experiments was carried out to profile each engine. Based on this bitstream flow analysis, a new dedicated hardware architecture is proposed to improve the hardware efficiency of the BAD engines. The proposed solution was described in VHDL and synthesized for a Xilinx Virtex-II Pro FPGA. The results show that the developed architecture reaches 103 MHz and delivers up to 4 bins per cycle in bypass engines, against 2 bins per cycle reported in the literature.
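
To make the bypass figure concrete, the following is a minimal behavioural sketch in Python of the bypass decoding step as specified by H.264/AVC: the offset is shifted left by one bit, the next bitstream bit is appended, and the result is compared against the range. It is not the VHDL architecture proposed here. The function names `decode_bypass` and `decode_bypass_bins`, the parameter names, and the sequential loop over four bins are assumptions made for illustration; a hardware engine would be expected to evaluate these steps within one clock cycle rather than iterate.

```python
from typing import Iterator, List, Tuple


def decode_bypass(cod_range: int, cod_offset: int, next_bit: int) -> Tuple[int, int]:
    """Decode one bypass bin; returns (bin_value, new_offset).

    The range is left unchanged by the bypass engine, so no
    renormalization and no context model lookup are needed.
    """
    # Shift the offset left and append the next bitstream bit.
    cod_offset = (cod_offset << 1) | next_bit
    if cod_offset >= cod_range:
        # Offset falls in the upper subinterval: bin is 1.
        return 1, cod_offset - cod_range
    # Offset falls in the lower subinterval: bin is 0.
    return 0, cod_offset


def decode_bypass_bins(
    cod_range: int, cod_offset: int, bits: Iterator[int], n: int = 4
) -> Tuple[List[int], int]:
    """Decode n consecutive bypass bins (hypothetical helper).

    This loop is a software model only: each step depends on the
    offset produced by the previous one, which is the dependency a
    multi-bin hardware engine has to resolve within a single cycle.
    """
    bins: List[int] = []
    for _ in range(n):
        bin_val, cod_offset = decode_bypass(cod_range, cod_offset, next(bits))
        bins.append(bin_val)
    return bins, cod_offset


if __name__ == "__main__":
    # Illustrative values only, not taken from a real bitstream.
    decoded, offset = decode_bypass_bins(256, 100, iter([1, 0, 1, 1]), n=4)
    print(decoded, offset)
```

Because the range stays constant across consecutive bypass bins, the comparisons for several bins can in principle be computed side by side, which is what makes multi-bin-per-cycle bypass decoding attractive in hardware.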