Abstract: Machine learning algorithms are designed to resolve unknown behaviours by extracting commonalities over massive datasets. Unfortunately, learning such global behaviours can be inaccurate and slow for systems composed of heterogeneous elements that behave very differently, as is the case, for instance, in cyber-physical systems and Internet of Things applications. Instead, to make smart decisions, such systems have to continuously refine the learned behaviour on a per-element basis and compose these small learning units together. However, combining and composing learned behaviours from different elements is challenging and requires domain knowledge. Therefore, there is a need to structure and combine the learned behaviours and domain knowledge in a flexible way. In this paper, we propose to weave machine learning into domain modeling. More specifically, we suggest decomposing machine learning into reusable, chainable, and independently computable small learning units, which we refer to as micro learning units. These micro learning units are modeled together with, and at the same level as, the domain data. We show, based on a smart grid case study, that our approach can be significantly more accurate than learning a global behaviour, while remaining fast enough to be used for live learning.