“…Over the last few years, it has gained popularity due to its success in enhancing network-wide performance (i.e., the Quality of Service, QoS) [26], its facilitation of intelligent behavior by adapting to complex and dynamically changing (wireless) environments [27], and its ability to add automation for realizing concepts of self-healing and self-optimization [28]. In recent years, different learning approaches have been applied to various wireless network schemes such as medium access control [29, 30], routing [9, 10], data aggregation and clustering [31, 32], localization [33, 34], energy harvesting communication [35], and cognitive radio [36, 37]. These schemes apply to a variety of wireless networks, such as mobile ad hoc networks [38], wireless sensor networks [18], wireless body area networks [39], cognitive radio networks [20, 40], and cellular networks [41].…”