Among all the functions constituting the TRMC (Translational and Rotational Motion Compensation) processing chain developed by Galileo Avionica for real-time high-resolution radar imaging, the RMC (Rotational Motion Compensation) function is undoubtedly the heaviest in terms of required computing load. In this paper the possibility of partitioning and parallelizing the RMC algorithm across several processors is considered. Three different schemes are characterized as alternatives to the basic round-robin scheme (processing of the whole signal matrix on one processor). Each scheme is then applied to a real example in order to estimate the required processing time; this estimate is based on reference timings measured on a single-processor implementation.
1-4244-1539-X/08/$25.00 ©2008 IEEE

INTRODUCTION
The idea behind the TRMC processing chain proposed and analyzed by Galileo Avionica for high-resolution radar imaging [1] is that any motion between the radar and the target to be imaged can always be modelled as the composition of pure translational and rotational components [2]. Once the translational component of the motion between radar platform and target has been compensated on the range-compressed signal, only the effects of the relative rotation between radar and target remain in the received signal. After the residual range and Doppler migration due to target rotation has been compensated, cross-range resolution is obtained by coherently processing (i.e. by applying an FFT, Fast Fourier Transform, to) the echoes returned from the target at the different aspect angles induced by the rotation.

The computational load required by the RMC algorithm described in [3], which compensates the undesired effects of rotational motion, can become prohibitive if the number of pixels of the image to be produced is very high. When performed by a single computing node, this algorithm may be unsuitable for real-time processing because of the long processing time required. This is especially true in applications where wide scene patches and sub-metric resolution are needed, since processing time grows with the signal size.

Assuming that a certain number of computing nodes is available, in the following we derive different schemes for distributing the algorithm across multiple processors, so as to reduce latency and increase throughput. To this aim we proceed as follows:
- the processes that make up the RMC algorithm are analyzed, and for each of them the decomposition possibilities (in the range and/or Doppler dimension) are considered;
- on the basis of the previous analysis, three partitioning schemes are derived;
- the derived schemes are applied to a real example;
- advantages and disadvantages of the different solutions are discussed.
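To make the idea of a decomposition in the range dimension concrete, the following minimal sketch partitions only the final cross-range transform, not the RMC algorithm of [3]. It relies on the fact that a Fourier transform along the pulse dimension treats each range bin independently, so contiguous blocks of range bins can be processed by separate workers and reassembled. The function names, the toy signal, the value of N_PULSES and the choice to store the matrix as range bins by pulses are assumptions made for illustration. Threads stand in for computing nodes, and a naive DFT replaces the FFT to keep the example dependency-free.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def dft(samples):
    """Naive discrete Fourier transform of one slow-time (pulse) sequence."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def cross_range_transform(block):
    """Transform every range bin of a block along the pulse dimension.

    `block` is a list of range bins, each holding one sample per pulse
    (an assumed layout, chosen so that a block is a slice of range bins).
    """
    return [dft(range_bin) for range_bin in block]


def split_in_range(matrix, n_nodes):
    """Split the range bins into n_nodes contiguous blocks of near-equal size."""
    base, extra = divmod(len(matrix), n_nodes)
    blocks, start = [], 0
    for node in range(n_nodes):
        size = base + (1 if node < extra else 0)
        blocks.append(matrix[start:start + size])
        start += size
    return blocks


def partitioned_transform(matrix, n_nodes):
    """Process each range block on its own worker, then reassemble in order."""
    blocks = split_in_range(matrix, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        results = list(pool.map(cross_range_transform, blocks))
    return [range_bin for block in results for range_bin in block]


# Hypothetical toy signal: 6 range bins x 8 pulses, one tone per range bin.
N_RBIN, N_PULSES = 6, 8
signal = [
    [cmath.exp(2j * cmath.pi * (r % N_PULSES) * p / N_PULSES) for p in range(N_PULSES)]
    for r in range(N_RBIN)
]

single_node = cross_range_transform(signal)            # whole matrix on one worker
multi_node = partitioned_transform(signal, n_nodes=3)  # three range blocks
```

In this sketch the partitioned result is identical to the single-worker result because no data is exchanged between range blocks. Steps of a compensation algorithm that couple range and Doppler would not split this cleanly and would require communication between nodes, which is one reason several partitioning schemes are worth comparing.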
ALGORITHM
In the following, "time-domain signal" indicates the range-compressed profiles collected during the Coherent Integration Time (CIT). Each range profile is composed of N_RBIN samples and corresponds to the echo received from a single transmitted pulse. The number of pulses collected...