Unprecedentedly high volumes of data are becoming available with the growth of the advanced metering infrastructure. These data are expected to benefit the planning and operation of the future power system and to help customers transition from a passive to an active role. In this paper, we explore, for the first time in the smart grid context, the benefits of using Deep Reinforcement Learning, a hybrid class of methods that combines Reinforcement Learning with Deep Learning, to perform on-line optimization of schedules for building energy management systems. The learning procedure was explored using two methods, Deep Q-learning and Deep Policy Gradient, both of which were extended to perform multiple actions simultaneously. The proposed approach was validated on the large-scale Pecan Street Inc. database. This high-dimensional database includes information about photovoltaic power generation, electric vehicles, and building appliances. Moreover, these on-line energy scheduling strategies could be used to provide real-time feedback to consumers to encourage more efficient use of electricity.