The emerging video coding standard H.264/AVC exhibits unprecedented coding performance. Compared with traditional coders such as MPEG-2 and MPEG-4 ASP, it achieves a bitrate saving of about one half in the official verification test. This performance has made it a candidate video compression format for the upcoming HD-DVD. On the downside, H.264/AVC has been criticized as logically more complex and more computationally demanding than any existing standard. A low-cost, efficient implementation therefore plays an important role in the success of this international standard. In this paper, we realize an H.264/AVC baseline decoder on a low-cost DSP processor, the Philips TriMedia TM-1300, and show that an effective software core makes H.264/AVC decoding feasible with much less computation. To this end, we first consider different approaches and take advantage of the SIMD instruction set to optimize the critical, time-consuming coding modules, such as fractional motion compensation, spatial prediction, and the inverse transform. We then present further optimizations for entropy decoding and in-loop deblocking filtering, even though these modules cannot benefit from SIMD. In our experiments, by exploiting appropriate instruction-level parallelism and efficient algorithms, the decoding speed is improved by a factor of 8 to 10: a CIF video sequence can be decoded at 19.74 to 28.97 fps on a 166-MHz TriMedia TM-1300 processor, compared with 2.40 to 2.98 fps for the standard reference software.
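As a rough illustration of one of the modules named above, the inverse transform, the following C sketch shows a scalar form of the 4x4 inverse integer transform core used for H.264/AVC residual blocks. It is not the TriMedia implementation described in this paper; the function names `butterfly4` and `idct4x4_sketch` and the in-place, raster-order block layout are assumptions made for illustration only. The butterflies consist solely of additions, subtractions, and shifts, which is the kind of regular structure that packed SIMD operations can exploit.

```c
#include <stdint.h>

/* Hypothetical helper: one 4-point inverse butterfly.
 * `stride` is 1 for a row pass and 4 for a column pass.
 * Assumes >> on negative ints is an arithmetic shift, as on
 * virtually all current compilers. */
static void butterfly4(const int *in, int *out, int stride)
{
    int e0 = in[0] + in[2 * stride];
    int e1 = in[0] - in[2 * stride];
    int e2 = (in[1 * stride] >> 1) - in[3 * stride];
    int e3 = in[1 * stride] + (in[3 * stride] >> 1);

    out[0]          = e0 + e3;
    out[1 * stride] = e1 + e2;
    out[2 * stride] = e1 - e2;
    out[3 * stride] = e0 - e3;
}

/* Hypothetical name: in-place 4x4 inverse transform of a block of
 * already dequantized coefficients stored in raster order. */
void idct4x4_sketch(int16_t blk[16])
{
    int a[16], b[16];
    int i;

    for (i = 0; i < 16; i++)
        a[i] = blk[i];

    for (i = 0; i < 4; i++)            /* horizontal pass, one row each */
        butterfly4(a + 4 * i, b + 4 * i, 1);

    for (i = 0; i < 4; i++)            /* vertical pass, one column each */
        butterfly4(b + i, a + i, 4);

    for (i = 0; i < 16; i++)           /* final rounding and scaling by 64 */
        blk[i] = (int16_t)((a[i] + 32) >> 6);
}
```

For example, a block whose only nonzero coefficient is a DC value of 64 yields a flat residual of 1 in all sixteen positions. In a SIMD version, the four rows (or columns) would typically be processed together in packed 16-bit lanes rather than one at a time as here.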