Scheduling FFT computation on SMP and multicore systems

Ali, Aiman; Johnsson, Lennart; Subhlok, Jaspal

doi:10.1145/1274971.1275011

Cited by 31 publications

(22 citation statements)

References 19 publications

(15 reference statements)


“…(8). This is straightforward to compute as a single multiplication is needed following the δ m (u, v) summing.…”

Section: Binary Similarity Measures (mentioning)

confidence: 99%

“…The FFT becomes more efficient when M, N m, n are large. The computation of the FFT can be supported by parallel implementation that can achieve acceleration factors of 3 to 7 using GPU [8], but raising other constraints linked to application portability.…”

Section: Fast Optimal Binary Template Matching (mentioning)

confidence: 99%

“…The major reasons are the no-invariance to scale and rotation, the lack of adaptability of similarity measures and the time-complexity. However, different contributions have been investigated during the last years to improve these aspects including the robustness and discrimination capability of similarity measures [4], [5], their characterization [5], [6], the time-processing optimization with hardware support [7], [8], etc.…”

Section: Introduction (mentioning)

confidence: 99%


Fast and Optimal Binary Template Matching Application to Manga Copyright Protection

Delalandre

Iwata

Kise

2014

2014 11th IAPR International Workshop on Document Analysis Systems


Abstract: Template matching is a technique used in classifying an object by comparing portions of images with another image. Finding a given template in an image is typically performed by scanning the image and evaluating the similarity with the template. When the scanning is concerned with the entire image, template matching is optimal. This paper considers a special case of template matching where the templates are binary. Although binary template matching has been studied extensively since the early days of pattern recognition, this technique no longer seems to be in use in Document Image Analysis (DIA). The major reasons are the time complexity, the no-invariance to scale and rotation, and the lack of adaptability of similarity measures. However, different contributions have been investigated during the last years to improve these aspects: robustness and discrimination capability of similarity measures, their characterization, time-processing optimization with hardware support, etc. In this paper, we will first review some of the recent issues about binary template matching. We will then present a system exploiting bitwise operators and parallel processing supporting fast and accurate binary template matching for Manga copyright protection. This system is compared to an FFT-based template matching, and it outperforms it in both processing time and detection accuracy.


“…A system by Kessler et al. [19], [20] automatically composes algorithms using empirical techniques. Other autotuning systems include SPARSITY [21] for sparse matrix computations, SPIRAL [22], [23], [24] for digital signal processing, UHFFT [25] for FFT on multicore systems, and OSKI [26] for sparse matrix kernels. ActiveHarmony [27], [28] provides a general framework for tuning configurable libraries and exploring different compiler optimizations.…”

Section: Related Work (mentioning)

confidence: 99%

Language and compiler support for auto-tuning variable-accuracy algorithms

Ansel

Wong

Chan

et al. 2011

International Symposium on Code Generation and Optimization (CGO 2011)


Abstract: Approximating ideal program outputs is a common technique for solving computationally difficult problems, for adhering to processing or timing constraints, and for performance optimization in situations where perfect precision is not necessary. To this end, programmers often use approximation algorithms, iterative methods, data resampling, and other heuristics. However, programming such variable accuracy algorithms presents difficult challenges since the optimal algorithms and parameters may change with different accuracy requirements and usage environments. This problem is further compounded when multiple variable accuracy algorithms are nested together due to the complex way that accuracy requirements can propagate across algorithms and because of the size of the set of allowable compositions. As a result, programmers often deal with this issue in an ad-hoc manner that can sometimes violate sound programming practices such as maintaining library abstractions. In this paper, we propose language extensions that expose trade-offs between time and accuracy to the compiler. The compiler performs fully automatic compile-time and install-time autotuning and analyses in order to construct optimized algorithms to achieve any given target accuracy. We present novel compiler techniques and a structured genetic tuning algorithm to search the space of candidate algorithms and accuracies in the presence of recursion and sub-calls to other variable accuracy code. These techniques benefit both the library writer, by providing an easy way to describe and search the parameter and algorithmic choice space, and the library user, by allowing high level specification of accuracy requirements which are then met automatically without the need for the user to understand any algorithm-specific parameters. Additionally, we present a new suite of benchmarks, written in our language, to examine the efficacy of our techniques. Our experimental results show that by relaxing accuracy requirements, we can easily obtain performance improvements ranging from 1.1x to orders of magnitude of speedup.


“…In 2007, Ali, A. et al developed a portable framework for FFT algorithms to run on various parallel architectures. The computational framework was also formulated using the language of Kronecker products [14]. In 2008, Rodríguez, D. co-authored an article where a methodology was presented for the high-level partitioning of signal transforms onto distributed hardware architectures using, again, the language of Kronecker products signal algebra [15].…”

Section: Introduction (mentioning)

confidence: 99%

A Framework for Multiple Object Tracking in Underwater Acoustic MIMO Communication Channels

Rodriguez

Aceros

Valera

et al. 2017

JSAN


This work presents a computational framework for the analysis and design of large-scale algorithms utilized in the estimation of acoustic, doubly-dispersive, randomly time-variant, underwater communication channels. Channel estimation results are used, in turn, in the proposed framework for the development of efficient high performance algorithms, based on fast Fourier transformations, for the search, detection, estimation and tracking (SDET) of underwater moving objects through acoustic wavefront signal analysis techniques associated with real-time electronic surveillance and acoustic monitoring (eSAM) operations. Particular importance is given in this work to the estimation of the range and speed of deep underwater moving objects modeled as point targets. The work demonstrates how to use Kronecker products signal algebra (KSA), a branch of finite-dimensional tensor signal algebra, as a mathematical language for the formulation of novel variants of parallel orthogonal matching pursuit (POMP) algorithms, as well as a programming aid for mapping these algorithms to large-scale computational structures, using a modified Kuck's paradigm for parallel computation.

