Abstract: Video streams, whether Video On-Demand (VOD) or live, usually have to be converted (i.e., transcoded) to match the characteristics of viewers' devices (e.g., spatial resolution or supported formats). Transcoding is a computationally expensive and time-consuming operation. Streaming service providers therefore store numerous transcoded versions of each video to serve various display devices. With the sharp increase in video streaming, however, this approach is becoming cost-prohibitive. Because viewers' access patterns to video streams follow a long-tail distribution, we propose to transcode the video streams with low access rates in an on-demand (i.e., lazy) manner using cloud computing services. The challenge in utilizing cloud services for on-demand video transcoding, however, is to maintain a robust Quality of Service (QoS) for viewers while remaining cost-efficient for streaming service providers. To address this challenge, we present the Cloud-based Video Streaming Services (CVS2) architecture. It includes a QoS-aware scheduling component that maps transcoding tasks to Virtual Machines (VMs) by considering the affinity of the transcoding tasks with the allocated heterogeneous VMs. To maintain robustness in the presence of varying streaming requests, the architecture also includes a cost-efficient VM Provisioner component, which provides a self-configurable cluster of heterogeneous VMs. The cluster is reconfigured dynamically to maintain the maximum affinity with the arriving workload. Simulation results obtained under diverse workload conditions demonstrate that the CVS2 architecture can maintain a robust QoS for viewers while reducing the incurred cost of the streaming service provider by up to 85%.
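To make the idea of affinity-aware mapping concrete, the following is a minimal sketch and not the CVS2 scheduler itself, whose details the abstract does not give. It uses a simple greedy heuristic that sends each transcoding task to the VM with the earliest expected completion time. The VM types, task types, execution times, and the names `EXEC_TIME`, `VM`, and `map_task` are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical execution times in seconds, indexed as EXEC_TIME[vm_type][task_type].
# A lower value means a higher affinity between that task type and that VM type.
# These numbers are illustrative assumptions, not measurements.
EXEC_TIME = {
    "gpu": {"codec": 2.0, "resolution": 1.0, "bitrate": 1.5},
    "cpu": {"codec": 3.0, "resolution": 4.0, "bitrate": 2.0},
}


@dataclass
class VM:
    name: str
    vm_type: str
    backlog: float = 0.0  # seconds of work already queued on this VM
    queue: list[str] = field(default_factory=list)


def map_task(task_type: str, vms: list[VM]) -> VM:
    """Greedily assign a task to the VM with the earliest expected completion time."""
    best = min(vms, key=lambda vm: vm.backlog + EXEC_TIME[vm.vm_type][task_type])
    best.backlog += EXEC_TIME[best.vm_type][task_type]
    best.queue.append(task_type)
    return best


if __name__ == "__main__":
    cluster = [VM("vm-gpu", "gpu"), VM("vm-cpu", "cpu")]
    for task in ["resolution", "resolution", "bitrate", "codec"]:
        chosen = map_task(task, cluster)
        print(f"{task} -> {chosen.name}")
```

In this sketch the queue backlog and the per-type execution time are combined into a single score, so a task can be diverted to a lower-affinity VM once the best-matching VM is busy. A provisioner in the spirit of the abstract could adjust the mix of VM types in `cluster` as the arriving task types change, but that part is not modeled here.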