This study addresses the scalability problems caused by the high complexity of integrated scheduling in high-speed switches, and presents a simple two-stage integrated scheduling algorithm that supports unicast and multicast traffic simultaneously. The first stage of the switching fabric switches unicast traffic and load-balances multicast traffic, whereas the second stage switches multicast traffic using a new multicast scheduling algorithm that reduces multicast head-of-line (HoL) blocking. To balance complexity against performance, the proposed integrated algorithm runs without iteration and reduces the scheduling overhead from the traditional O(kN) to O(N), using two-phase (request-grant) scheduling for unicast and multicast traffic at each stage. Simulation results show that the proposed integrated algorithm achieves good throughput and average delay across different traffic compositions and traffic patterns, especially under non-uniform traffic.
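As a rough illustration of non-iterative two-phase (request-grant) scheduling, the sketch below runs one scheduling round for unicast cells on an N x N crossbar. It is not the algorithm proposed in this study: the function name `request_grant`, the longest-queue request rule, and the round-robin grant pointers are all assumptions chosen for illustration, since the abstract does not specify them. Because each input issues only one request, every grant is already conflict-free, so no accept phase or further iteration is needed.

```python
from typing import List, Optional


def request_grant(voq: List[List[int]], grant_ptr: List[int]) -> List[Optional[int]]:
    """Run one non-iterative request-grant round (illustrative only).

    voq[i][j]    -- number of cells queued at input i for output j
    grant_ptr[j] -- round-robin pointer of output j (updated in place)
    Returns match, where match[i] is the output granted to input i, or None.
    """
    n = len(voq)

    # Request phase (assumed rule): each input sends exactly one request,
    # to the output with its longest non-empty queue.
    requests: List[List[int]] = [[] for _ in range(n)]
    for i in range(n):
        best = max(range(n), key=lambda j: voq[i][j])
        if voq[i][best] > 0:
            requests[best].append(i)

    # Grant phase (assumed rule): each output grants the requesting input
    # closest to its round-robin pointer, then advances the pointer.
    match: List[Optional[int]] = [None] * n
    for j in range(n):
        if not requests[j]:
            continue
        chosen = min(requests[j], key=lambda i: (i - grant_ptr[j]) % n)
        match[chosen] = j
        grant_ptr[j] = (chosen + 1) % n

    return match


if __name__ == "__main__":
    queues = [[2, 0, 0], [3, 0, 1], [0, 0, 4]]
    pointers = [0, 0, 0]
    print(request_grant(queues, pointers))
```

In this sketch, inputs 0 and 1 both request output 0, and the round-robin pointer decides which of them is served in a given round; the losing input simply waits for the next round instead of triggering another iteration.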