Numerical techniques have long been used to approximate the value of a definite integral. Traditional approaches have been mostly software oriented. However, with the current trend moving back towards hardware-intensive processing, it is desirable to develop a hardware-oriented solution whose performance can be assessed in terms of realistic parameters such as speed, power, and area. This paper exploits the one-to-one correspondence between numerical integration algorithms and general FIR filters. Based on this correspondence, a structure is developed that implements the integration algorithm. Such implementations, however, typically have large critical path delays that limit the achievable sampling and throughput rates. The paper addresses this problem by exploiting concurrency at various levels within the algorithm. Pipelined and parallel structures are developed, and their effects on speed and power metrics are studied separately. It is shown that these architectural modifications alter the data paths within the structure so that it can be operated at higher throughput rates and/or with lower power consumption. FPGAs are used as the implementation platform because they provide a high level of hardware programmability.
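To illustrate the correspondence between integration algorithms and FIR filters, the following is a minimal software sketch, not the hardware structure developed in the paper. It assumes composite Simpson's rule as the integration algorithm (the text above does not name a specific rule) and models it as a 3-tap FIR filter with coefficients h/3, 4h/3, h/3 whose outputs are decimated by two and accumulated. The names `fir_filter` and `simpson_via_fir` are hypothetical and chosen only for this example.

```python
import math


def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k].

    Only outputs for which the full window of samples is available
    are returned, so the result has len(samples) - len(taps) + 1 entries.
    """
    n_taps = len(taps)
    return [
        sum(taps[k] * samples[n - k] for k in range(n_taps))
        for n in range(n_taps - 1, len(samples))
    ]


def simpson_via_fir(f, a, b, intervals):
    """Composite Simpson's rule written as FIR filtering plus accumulation.

    Assumption: Simpson's rule stands in for "the integration algorithm";
    other Newton-Cotes rules would only change the tap values.
    """
    if intervals <= 0 or intervals % 2 != 0:
        raise ValueError("intervals must be a positive even number")
    h = (b - a) / intervals
    samples = [f(a + i * h) for i in range(intervals + 1)]

    # Simpson weights for one pair of intervals act as the filter taps.
    taps = [h / 3.0, 4.0 * h / 3.0, h / 3.0]
    outputs = fir_filter(samples, taps)

    # Keep every second output so consecutive 3-sample windows share
    # only their end points, then accumulate the panel integrals.
    return sum(outputs[::2])


if __name__ == "__main__":
    approx = simpson_via_fir(math.sin, 0.0, math.pi, 100)
    print(f"integral of sin on [0, pi] ~ {approx:.8f}")
```

In this view each filter output is the integral over one pair of intervals, so the multiply-accumulate chain inside `fir_filter` is the kind of data path whose critical path delay pipelining and parallel processing aim to shorten.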