The saliency map is a central component of many visual attention systems, particularly for learning and controlling bottom-up attention. In this work we develop a hardware tool that extracts a saliency map from a video sequence. The saliency map is obtained by aggregating primary features of each frame, such as intensity, color, and line orientation, together with the temporal difference between consecutive frames. The system is designed to provide both high speed and acceptable accuracy for real-time applications such as machine vision and robotics. A versatile Verilog model of the video processing system is developed, which can easily be mapped and synthesized on various FPGA or ASIC platforms. The proposed parallel hardware can process over 50 million pixels per second, about 2x faster than state-of-the-art designs. Experimental results on sample images confirm the applicability and efficiency of the developed system in real-time applications.
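The feature-aggregation scheme described above can be sketched in software as follows. This is a minimal illustrative sketch, not the paper's hardware design: it assumes equal-weight summation of normalized feature maps, uses a gradient magnitude as a crude stand-in for oriented-edge (e.g. Gabor) filters, and a simple red-green/blue-yellow opponency as the color feature. All function names and parameters here are hypothetical.

```python
import numpy as np

def normalize(m):
    # Scale a feature map to [0, 1]; return zeros for a flat map.
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def saliency_map(frame, prev_frame=None):
    """Aggregate per-frame feature maps (intensity, color opponency,
    edge orientation) with an optional temporal-difference map.
    `frame` is an H x W x 3 float array with values in [0, 1]."""
    r, g, b = frame[..., 0], frame[..., 1], frame[..., 2]
    intensity = (r + g + b) / 3.0
    # Simple color-opponency feature: red-green and blue-yellow contrast.
    color = np.abs(r - g) + np.abs(b - (r + g) / 2.0)
    # Gradient magnitude as a stand-in for oriented-edge filtering.
    gy, gx = np.gradient(intensity)
    orientation = np.hypot(gx, gy)
    maps = [normalize(intensity), normalize(color), normalize(orientation)]
    if prev_frame is not None:
        prev_intensity = prev_frame.mean(axis=-1)
        # Temporal difference between consecutive frames.
        maps.append(normalize(np.abs(intensity - prev_intensity)))
    # Equal-weight aggregation of the normalized feature maps.
    return normalize(sum(maps))
```

In the hardware realization these feature maps would be computed in parallel per pixel stream rather than on whole frames, which is what enables the reported throughput.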