Abstract-Despite recent advances that have greatly improved the performance of embedded systems, energy consumption remains a major challenge in energy-constrained embedded and communication platforms. Optimizing applications for energy consumption is therefore a compelling research direction, on both the practical and theoretical sides. This paper presents a new experimental bench for energy profiling of non-performance-critical embedded and mobile applications and reports preliminary results obtained on two embedded boards. The experiments are driven by an online energy monitoring mechanism that uses National Instruments' cDAQ and LabVIEW running on a host machine. The host monitors a target device, which runs a set of benchmarks. We describe the experience gained from using and modifying two different target boards, namely an Nvidia Jetson TX1 and a TI AM572x evaluation module. In particular, we confirm the existence of the Energy/Frequency Convexity Rule for CPU-bound benchmarks, which further validates it. This rule states that there exists an optimal clock frequency that minimizes the CPU energy consumption for non-performance-critical applications. We also show that the gain from frequency scaling is highly dependent on workload characteristics. Any future energy-management approach should take these behavioral traits into consideration.
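The intuition behind the Energy/Frequency Convexity Rule can be sketched with a simple first-order model. This model is an assumption made here for illustration only and is not the measurement model used in this work: power is taken to be a constant static term plus a dynamic term that grows with the cube of the clock frequency, and the runtime of a CPU-bound task is taken to be inversely proportional to frequency. The names `energy_joules`, `optimal_frequency`, and `best_available_frequency`, as well as the parameter values, are hypothetical.

```python
# Illustrative first-order model (an assumption, not measured data):
#   power(f)   = p_static + c_dyn * f**3      [W]
#   runtime(f) = cycles / f                   [s]
#   energy(f)  = power(f) * runtime(f)        [J]
# energy(f) = cycles * (p_static / f + c_dyn * f**2) is convex for f > 0.


def energy_joules(freq_hz, cycles, p_static_w, c_dyn):
    """Energy needed to execute `cycles` CPU cycles at a fixed frequency."""
    runtime_s = cycles / freq_hz
    power_w = p_static_w + c_dyn * freq_hz ** 3
    return power_w * runtime_s


def optimal_frequency(p_static_w, c_dyn):
    """Frequency minimizing energy_joules, from d(energy)/d(freq) = 0.

    Setting -p_static / f**2 + 2 * c_dyn * f = 0 gives
    f**3 = p_static / (2 * c_dyn).
    """
    return (p_static_w / (2.0 * c_dyn)) ** (1.0 / 3.0)


def best_available_frequency(freqs_hz, cycles, p_static_w, c_dyn):
    """Pick the lowest-energy point among discrete operating frequencies."""
    return min(
        freqs_hz,
        key=lambda f: energy_joules(f, cycles, p_static_w, c_dyn),
    )


if __name__ == "__main__":
    # Hypothetical parameters chosen only to make the curve visible.
    P_STATIC_W = 0.5
    C_DYN = 1e-27
    CYCLES = 1e9
    for f in (2e8, 4e8, 6e8, 8e8, 1e9, 1.2e9):
        e = energy_joules(f, CYCLES, P_STATIC_W, C_DYN)
        print(f"{f / 1e6:7.0f} MHz -> {e:.3f} J")
    print("model optimum (Hz):", optimal_frequency(P_STATIC_W, C_DYN))
```

Under these assumptions, running too slowly wastes static energy over a long runtime, while running too fast wastes dynamic energy, so the minimum lies at an intermediate frequency. On a real board the available frequencies are discrete, which is why the sketch also selects the best point from a list; the actual shape of the curve depends on the platform and workload.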