Application development is becoming increasingly diverse in terms of requirements and objectives. Considering this, major cloud providers offer a large variety of hardware types to better fit their customers’ needs. Moreover, applications are commonly designed in a distributed fashion: they typically consist of building blocks that perform individual tasks and communicate with each other to achieve a given objective. Similarly, in order to fulfill the overall requirements of the application, each of its building blocks may run on a different instance type (e.g., GPUs or FPGAs), as their individual requirements can differ. In this work, we add energy efficiency to the multiple objectives an application might already have, such as performance or security. Recent studies report alarming and still-increasing energy consumption in the cloud computing sector, especially from data centers.

We tackle this problem from two different perspectives: 1) cloud providers, and 2) application developers. From the cloud provider’s point of view, we first propose a novel scheduling strategy for single-tenant scenarios that leverages the heterogeneity of both clusters and applications to better address the trade-off between performance and energy efficiency. We then extend this approach to the multi-tenant scenario, where we rely on differential approximation and sprinting to avoid job eviction and resource waste, leading to an overall energy gain. In the second part of this thesis, we switch the view from cloud providers to application developers. Here we show that specific characteristics of applications can also be leveraged to achieve more energy-efficient approaches while maintaining or even improving their remaining objectives. To that end, we use automated parameter tuning for deep learning applications as a use case.
We start by characterizing such applications to show how strongly the chosen tuning approach can affect the end results of tuning, training, and inference. We then propose a novel auto-tuning approach that takes advantage of the high parallelism and recurring characteristics of such applications to simultaneously consider performance, accuracy, and energy efficiency. Once the tuning process is complete, the final model is typically deployed for inference. Therefore, our next step is to extend our approach to perform inference-aware tuning. With this complementary step, we additionally provide users with recommendations on how to deploy their models so that inference-related objectives are also met.

Finally, regardless of the perspective taken, security is one of the major concerns in this area. Cloud providers need to ensure data privacy for their customers, in particular in multi-tenant scenarios where resources are shared among several users. At the same time, deep learning applications are often trained on sensitive user data, which also has to be protected. Recently, trusted execution environments (TEEs) have become widely available in server machines as a mechanism to guarantee security. However, such approaches still have limitations in terms of usability and the overhead they can add to applications. Considering this, we close this thesis by proposing an approach that automates the process of ensuring security in the context of heterogeneous applications and clusters.

In summary, this thesis aims to achieve more energy-efficient computing while respecting other application requirements such as performance, security, or accuracy. We propose new mechanisms that apply generally to job management, as well as optimizations specific to a deep learning use case.
Finally, we conclude this work by exploring a TEE-based strategy for adding a security layer on top of the proposed approaches.