Hacker's Delight

Wark, McKenzie

doi:10.3917/rdes.055.0118

Cited by 14 publications

(17 citation statements)

References 0 publications

“…Even today, many MCUs do not have an extended instruction set, so such techniques are very important. For example, replacing division by a product of reciprocal constants and utilizing bit shifting for multiplying and dividing powers of two are well-known techniques [25]. Computational tricks using the floating-point structure of the IEEE754 are effective, and useful techniques are known for obtaining fast approximations of exponential and logarithmic functions [21,26] and inverse square roots (reciprocal sqrt) [25,27–32].…”

Section: Introduction (mentioning)

confidence: 99%
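
The two integer techniques named in the statement above can be sketched as follows. This is a minimal illustration, not code from the cited paper or from the book: the function names are hypothetical, values are treated as unsigned 32-bit integers, and the constant 0xCCCCCCCD with a shift of 35 is the commonly used reciprocal for dividing by 10, chosen here as an assumption for the example.

```python
MASK32 = 0xFFFFFFFF  # emulate unsigned 32-bit arithmetic in Python


def mul_pow2(x: int, k: int) -> int:
    """Multiply by 2**k with a left shift (wraps modulo 2**32)."""
    return (x << k) & MASK32


def div_pow2(x: int, k: int) -> int:
    """Divide by 2**k with a right shift (floor division, unsigned)."""
    return (x & MASK32) >> k


# Assumed example constants: ceil(2**35 / 10) and the matching shift.
DIV10_MAGIC = 0xCCCCCCCD
DIV10_SHIFT = 35


def div10(x: int) -> int:
    """Divide an unsigned 32-bit value by 10 using one multiply and one shift."""
    return ((x & MASK32) * DIV10_MAGIC) >> DIV10_SHIFT
```

On an MCU the product would be formed with a 32x32 to 64-bit multiply (or a multiply-high instruction); Python's unbounded integers hide that detail.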

“…For example, replacing division by a product of reciprocal constants and utilizing bit shifting for multiplying and dividing powers of two are well-known techniques [25]. Computational tricks using the floating-point structure of the IEEE754 are effective, and useful techniques are known for obtaining fast approximations of exponential and logarithmic functions [21,26] and inverse square roots (reciprocal sqrt) [25,27–32]. In particular, the latter is a well-known algorithm called the fast inverse square root (FISR), which is a useful technique that can be used to find the reciprocal and square root.…”

Section: Introduction (mentioning)

confidence: 99%
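
A minimal sketch of the fast inverse square root mentioned above, assuming single-precision IEEE 754 input and the widely circulated constant 0x5F3759DF. Neither the constant nor the function name comes from the citing paper. The refinement step here runs in Python's double precision, so results differ slightly from a pure float32 implementation.

```python
import math
import struct


def fast_inv_sqrt(x: float, newton_steps: int = 1) -> float:
    """Approximate 1/sqrt(x) for positive, normal float32 values of x."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Halving the bit pattern and subtracting it from the assumed magic
    # constant gives a rough first guess for the bits of 1/sqrt(x).
    i = (0x5F3759DF - (i >> 1)) & 0xFFFFFFFF
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # Each Newton-Raphson step refines the guess with multiply-adds only.
    for _ in range(newton_steps):
        y = y * (1.5 - 0.5 * x * y * y)
    return y


if __name__ == "__main__":
    for value in (0.25, 2.0, 100.0):
        print(value, fast_inv_sqrt(value), 1.0 / math.sqrt(value))
```

The square root and the reciprocal follow from the same estimate: sqrt(x) is approximately x * fast_inv_sqrt(x), and 1/x is approximately fast_inv_sqrt(x) squared.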

Fast and Accurate Approximation Methods for Trigonometric and Arctangent Calculations for Low-Performance Computers

Kusaka, Tanaka

2022

Electronics

In modern computers, complicated signal processing is highly optimized with the use of compilers and high-speed processing using floating-point units (FPUs); therefore, programmers have little opportunity to care about each process. However, a highly accurate approximation can be processed in a small number of computation cycles, which may be useful when embedded in a field-programmable gate array (FPGA) or micro controller unit (MCU), or when performing many large-scale operations on a graphics processing unit (GPU). It is necessary to devise algorithms to obtain the desired calculated values without an accelerator or compiler assistance. The residual correction method (RCM) developed here can produce simple and accurate approximations of certain nonlinear functions with minimal multiply–add operations. In this study, we designed an algorithm for the approximate computation of trigonometric and inverse trigonometric functions, which are nonlinear elementary functions, to achieve their fast and accurate computation. A fast first approximation and a more accurate second approximation of each function were created using RCM with a less than 0.001 error using multiply–add operations only. This achievement is particularly useful for MCUs, which have a low power consumption but limited computational power, and the proposed approximations are candidate algorithms that can be used to stabilize the attitude control of robots and drones, which require real-time processing.

“…The former method can be implemented quickly with a greater searching efficiency, and the latter is the method for which recent central processing units (CPUs) have machine-language instructions and lookup tables implemented in advance. 3) For conventional processing with a CPU, the computation speed can be improved by parallelization of the CPU and extension of the memory. However, the processing and storage costs are increased accordingly.…”

Section: Introduction (mentioning)

confidence: 99%

High-speed image matching with coaxial holographic optical correlator

Ikeda, Watanabe

2016

Jpn. J. Appl. Phys.

A computation speed of more than 100 Gbps is experimentally demonstrated using our developed ultrahigh-speed optical correlator. To verify this high computation speed practically, the computation speeds of our optical correlator and conventional digital image matching are quantitatively compared. We use a population count function that achieves the fastest calculation speed when calculating binary matching by a central processing unit (CPU). The calculation speed of the optical correlator is dramatically faster than that using a CPU (2.40 GHz × 4) and 16 GB of random access memory, especially when the calculation data are large-scale.
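
The population count used for binary matching on a CPU can be sketched as below. This is an assumption-laden illustration rather than the authors' benchmark code: it uses the well-known branch-free 32-bit bit-counting scheme, and the names `popcount32`, `matching_bits`, and `match_score` are hypothetical. Images are assumed to be packed into lists of 32-bit words.

```python
def popcount32(x: int) -> int:
    """Count set bits in an unsigned 32-bit word without a loop."""
    x &= 0xFFFFFFFF
    x = x - ((x >> 1) & 0x55555555)                 # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)  # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                 # 8-bit partial sums
    # Multiplying by 0x01010101 adds the four bytes into the top byte.
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24


def matching_bits(a: int, b: int) -> int:
    """Number of bit positions where two 32-bit words agree."""
    return 32 - popcount32(a ^ b)


def match_score(words_a: list, words_b: list) -> int:
    """Total agreeing bits between two equally long packed binary images."""
    return sum(matching_bits(a, b) for a, b in zip(words_a, words_b))
```

Where the processor offers a population count instruction, a compiler intrinsic would normally replace `popcount32`; the arithmetic version is the fallback for hardware without one.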

“…2.2. The resulting code runs about 7 times faster than our previous approach [12], where we applied the linearly scrambled Halton sequence as in Algorithm 2, used permutation tables for the first 32 dimensions in combination with the simultaneous inversion of multiple digits, and replaced divisions and modulo operations by cheaper operations [43].…”

Section: Results (mentioning)

confidence: 99%

“…The efficiency of loop unrolling varies according to code size and application as it may cool the instruction cache. Unless already done by the compiler, costly division and modulo operations may be replaced by cheaper multiplications, shifts, additions, and subtractions [43].…”

Section: Efficient Radical Inversion (mentioning)

confidence: 99%
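
To make the replacement of division and modulo concrete in the radical-inversion setting, here is a hedged sketch for base 3. It is not Keller's implementation: it omits scrambling and permutation tables, assumes unsigned 32-bit indices, and uses the standard reciprocal constant 0xAAAAAAAB with a shift of 33 for dividing by 3. The names `divmod3` and `radical_inverse_base3` are hypothetical.

```python
# Assumed constants: ceil(2**33 / 3) and the matching shift.
DIV3_MAGIC = 0xAAAAAAAB
DIV3_SHIFT = 33


def divmod3(n: int) -> tuple:
    """Quotient and remainder of an unsigned 32-bit n divided by 3,
    computed with a multiply, shifts, an addition, and a subtraction."""
    q = (n * DIV3_MAGIC) >> DIV3_SHIFT
    r = n - ((q << 1) + q)  # n - 3*q without a second multiply
    return q, r


def radical_inverse_base3(n: int) -> float:
    """Mirror the base-3 digits of n about the radix point."""
    inv_base = 1.0 / 3.0
    scale = inv_base
    result = 0.0
    while n:
        n, digit = divmod3(n)   # peel off the least significant digit
        result += digit * scale
        scale *= inv_base
    return result
```

As the statement above notes, an optimizing compiler often performs this substitution itself when the divisor is a compile-time constant, so the manual form matters mainly where that cannot be relied on.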

Myths of Computer Graphics

Keller

Monte Carlo and Quasi-Monte Carlo Methods 2004

Quasi-Monte Carlo methods have become the industry standard in computer graphics. For that purpose, efficient algorithms for low discrepancy sequences are discussed. In addition, numerical pitfalls encountered in practice are revealed. We then take a look at massively parallel quasi-Monte Carlo integro-approximation for image synthesis by light transport simulation. Beyond superior uniformity, low discrepancy points may be optimized with respect to additional criteria, such as noise characteristics at low sampling rates or the quality of low-dimensional projections.
