We present several implementations of the 3D Fast Wavelet Transform (3D-FWT) on multicore CPUs and manycore GPUs. On the GPU side, we use CUDA and OpenCL to develop methods for an efficient mapping onto manycore hardware. On multicore CPUs, OpenMP and Pthreads serve as counterparts to maximize parallelism, and well-known techniques such as tiling and blocking are applied to optimize memory use. We evaluate these proposals by comparing a Fermi-based Tesla C2050 against an Intel Core 2 Quad Q6700. The CUDA version achieves the best results, running 5.3x to 7.4x faster than the CPU depending on image size, and up to 81x faster when communications are neglected. OpenCL also obtains solid gains, ranging from about 2x on small frame sizes to about 3x on larger ones.
GPUs have recently attracted attention as accelerators for a wide variety of algorithms, including assorted examples within the image analysis field. Among them, wavelets are gaining popularity as solid tools for data mining and video compression, though at a high computational cost. Having shown the effectiveness of the GPU for accelerating the 2D Fast Wavelet Transform [1], we present in this paper a novel implementation on manycore GPUs and multicore CPUs for high-performance computation of the 3D Fast Wavelet Transform (3D-FWT). This algorithm combines a challenging access pattern on matrix operators, which demands high sustained bandwidth, with mathematical functions of remarkable arithmetic intensity on ALUs and FPUs. On the GPU side, we use CUDA to develop methods for an efficient mapping onto manycore hardware and to fully exploit the memory hierarchy, which the programmer must manage explicitly. On multicore CPUs, OpenMP and Pthreads serve as counterparts to maximize parallelism, and well-known techniques such as tiling and blocking are applied to optimize memory use. Experimental results on an Nvidia Tesla C870 GPU and an Intel Core 2 Quad Q6700 CPU indicate that our implementation runs three times faster on the Tesla, and up to fifteen times faster when communications are neglected, which makes the GPU suitable for real-time video processing in many applications that involve the 3D-FWT.
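As a point of reference for the abstracts above, a 3D-FWT is commonly computed separably: a 1D transform is applied along each row, then each column, then along the time axis of a stack of frames. The following is a minimal sequential sketch of one decomposition level. It assumes a Haar filter purely for brevity (the abstracts do not state which wavelet filter is used), and the names `haar_1d` and `fwt_3d` are hypothetical. It is not the authors' CUDA, OpenCL, OpenMP, or Pthreads code, and it ignores tiling and blocking entirely.

```python
import math


def haar_1d(signal):
    """One level of a 1D Haar transform: low-pass half, then high-pass half."""
    s = 1.0 / math.sqrt(2.0)  # orthonormal scaling, preserves signal energy
    half = len(signal) // 2
    low = [(signal[2 * i] + signal[2 * i + 1]) * s for i in range(half)]
    high = [(signal[2 * i] - signal[2 * i + 1]) * s for i in range(half)]
    return low + high


def fwt_3d(volume):
    """One level of a separable 3D transform over volume[t][y][x].

    Assumes every dimension has even length. Returns a new nested list.
    """
    T, Y, X = len(volume), len(volume[0]), len(volume[0][0])
    out = [[row[:] for row in frame] for frame in volume]

    # Pass 1: along x (each row of each frame; contiguous accesses).
    for t in range(T):
        for y in range(Y):
            out[t][y] = haar_1d(out[t][y])

    # Pass 2: along y (each column of each frame; strided accesses).
    for t in range(T):
        for x in range(X):
            col = haar_1d([out[t][y][x] for y in range(Y)])
            for y in range(Y):
                out[t][y][x] = col[y]

    # Pass 3: along t (same pixel across frames; the largest stride).
    for y in range(Y):
        for x in range(X):
            tube = haar_1d([out[t][y][x] for t in range(T)])
            for t in range(T):
                out[t][y][x] = tube[t]

    return out
```

The second and third passes walk the volume with increasingly large strides, which illustrates why memory access patterns dominate the cost of this algorithm. For a constant 2x2x2 volume, all of the energy ends up in the single low-pass coefficient `out[0][0][0]` and every other coefficient is zero.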
Abstract: Lorca is a medium-sized city that, owing to its strategic location on the Mediterranean corridor, offers a range of business opportunities. Traditionally, industrial activity has played a secondary role in the local economy relative to other productive sectors, which have absorbed most of the labor force and the wealth generated. Despite this, industrial activities have a very notable impact on the urban and peri-urban space of Lorca, with highly diverse location patterns that respond to business interests and to the urban planning and practice carried out in the municipality. Mature industrial sectors in clear decline have been joined by more modern ones in a phase of expansion. As on other occasions, industry in Lorca, at this moment of acute international economic crisis, lacks the capacity to absorb the surplus labor from other sectors, which fails to meet the pressing need to diversify and modernize the traditional Lorca economic model with new alternatives. Keywords: industrial activity, active population, urban planning, territorial impact, industrial location strategies, Lorca.