SUMMARY

Obtaining an effective and efficient system requires assessing many architectural solutions and comparing their performance. Designing and evaluating several hardware architectures, however, takes an enormous amount of development time. This paper explores possible architectures for the two-dimensional discrete wavelet transform (2D-DWT). In particular, the filters used by the JPEG2000 standard are studied: the lossy 9/7 filter and the lossless 5/3 filter. The designs are coded at a high level of description in the Handel-C language, and the target hardware is a Xilinx FPGA of the Virtex-4 family. In-place computation of the 2D-DWT with the lifting scheme is known to save memory space, but the computed coefficients are not stored at consecutive addresses, so an address-decoding system becomes necessary, and the address decoding must be redone at every new decomposition level. We therefore propose a new, simple and efficient technique that stores the computed coefficients in a homogeneous memory space and at consecutive addresses, with no additional memory or address decoding. Finally, the designed system achieves a rather high throughput with an optimized number of hardware resources.
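The addressing problem mentioned above can be illustrated with a minimal software sketch of the conventional in-place approach. It is not the technique proposed in this paper, and it is written in Python rather than Handel-C. The sketch assumes the reversible 5/3 lifting steps of JPEG2000 with symmetric boundary extension, applied to a one-dimensional signal of at least two samples per level; the function names `lift_53_inplace` and `approx_addresses` are hypothetical.

```python
def lift_53_inplace(x, stride=1):
    """One level of reversible 5/3 lifting, in place, on the samples x[0::stride].

    Assumes at least two samples at this stride and symmetric extension
    at both borders. Detail coefficients overwrite the odd-positioned
    samples, approximation coefficients overwrite the even-positioned ones.
    """
    idx = list(range(0, len(x), stride))
    n = len(idx)
    # Predict step: each odd sample becomes a detail (high-pass) coefficient.
    for k in range(1, n, 2):
        left = x[idx[k - 1]]
        right = x[idx[k + 1]] if k + 1 < n else left  # mirror at the right border
        x[idx[k]] -= (left + right) >> 1
    # Update step: each even sample becomes an approximation (low-pass) coefficient.
    for k in range(0, n, 2):
        d_left = x[idx[k - 1]] if k >= 1 else x[idx[k + 1]]  # mirror at the left border
        d_right = x[idx[k + 1]] if k + 1 < n else d_left     # mirror at the right border
        x[idx[k]] += (d_left + d_right + 2) >> 2
    return x


def approx_addresses(n, levels):
    """Addresses that hold approximation coefficients after `levels` in-place levels."""
    return list(range(0, n, 2 ** levels))


if __name__ == "__main__":
    signal = [10] * 8
    lift_53_inplace(signal, stride=1)  # level 1 works on every sample
    lift_53_inplace(signal, stride=2)  # level 2 must skip the level-1 details
    print(signal)
    print(approx_addresses(len(signal), 2))
```

After each level the approximation coefficients sit at addresses that are multiples of 2^level, so the stride doubles from one level to the next. This is the per-level address decoding that the in-place scheme requires.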