Hadoop is a widely used framework for processing big data, and its task scheduling algorithm has a significant impact on cluster performance. Existing Hadoop scheduling algorithms do not account for performance differences between nodes in heterogeneous clusters, which leads to uneven task allocation and low resource utilization. To address this problem, we propose a spider monkey optimization-based scheduling algorithm (SMOSA) for heterogeneous Hadoop clusters. First, the cluster heartbeat mechanism is used to collect each node's memory and CPU information so that the actual load capacity of each node can be assessed comprehensively. Then, the spider monkey optimization algorithm searches for the optimal mapping between tasks and resources, using task completion time as the objective function and iteratively updating the positions of the spider monkeys. Finally, we calculate the remaining rate of each node's hardware resources and, according to the task type, give priority for the task to the node with the higher remaining resource rate. For I/O-type tasks, data are compressed to reduce disk operations and speed up task execution. Experimental results show that, compared with existing scheduling algorithms, SMOSA effectively reduces task execution time and significantly improves scheduling efficiency and task execution speed, especially in heterogeneous Hadoop clusters. Across different task types, execution time is reduced by up to 19%.
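
The final selection step can be illustrated with a minimal Python sketch. It is not the SMOSA implementation: the `Node` fields, the `WEIGHTS` table, the two task-type labels, and the weighted-sum form of the remaining rate are all assumptions made for illustration, since the abstract does not give the exact formula.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical snapshot of one node, as might be reported by a heartbeat."""
    name: str
    cpu_total: float  # e.g. number of cores
    cpu_used: float
    mem_total: float  # e.g. GB
    mem_used: float


# Assumed per-task-type weights (cpu_weight, mem_weight); illustrative values only.
WEIGHTS = {
    "cpu": (0.7, 0.3),
    "io": (0.3, 0.7),
}


def remaining_rate(node: Node, task_type: str) -> float:
    """Weighted fraction of free CPU and memory for the given task type."""
    cpu_free = 1.0 - node.cpu_used / node.cpu_total
    mem_free = 1.0 - node.mem_used / node.mem_total
    w_cpu, w_mem = WEIGHTS[task_type]
    return w_cpu * cpu_free + w_mem * mem_free


def pick_node(nodes: list[Node], task_type: str) -> Node:
    """Return the node with the highest remaining rate for this task type."""
    return max(nodes, key=lambda n: remaining_rate(n, task_type))


nodes = [
    Node("a", cpu_total=8, cpu_used=6, mem_total=16, mem_used=4),
    Node("b", cpu_total=8, cpu_used=2, mem_total=16, mem_used=12),
]
print(pick_node(nodes, "cpu").name)
print(pick_node(nodes, "io").name)
```

Under these assumed weights, a CPU-heavy task is sent to the node with more idle CPU, while a task weighted toward memory is sent to the node with more free memory. A real scheduler would take these figures from heartbeat reports rather than from hard-coded values.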