Deploying dense networks of small base stations is one of the most promising ways for future mobile networks to meet the expected growth in traffic demand. However, such an infrastructure consumes a considerable amount of energy, which raises both environmental concerns and the operational expenses of mobile operators. Powering small base stations with renewable energy has recently been considered as a means to reduce the energy footprint of mobile networks. In this paper, we consider a hierarchical structure in which some of the base stations are powered exclusively by solar panels and batteries. Base stations are grouped in clusters and connected in a micro-grid. A central controller enables base station sleep modes and energy sharing among the base stations based on the available energy budget and the traffic demands. We propose three implementations of the controller based on Machine Learning models, namely Imitation Learning, Q-Learning and Deep Q-Learning, each capable of learning optimal sleep mode and energy sharing policies. We provide a thorough discussion of the performance, complexity and feasibility of the proposed models, together with the energy and cost savings attained.