Varying channel conditions, dynamic traffic flows, interference and congestion are the main challenges to achieving high-throughput data delivery in multi-radio multi-channel Wireless Mesh Networks (WMNs). The performance of existing solutions is limited because they rely either on statically computed end-to-end relay paths or on myopic forwarding decisions. In this paper, we consider the problem of high-throughput traffic forwarding, which involves good-quality link selection, channel allocation and power control at each forwarding router. Every router chooses a set of outgoing link-channel pairs and their power allocations by solving a mixed-integer nonlinear programming (MINLP) problem that maximizes its total outgoing flow rate while keeping interference and congestion at a minimum. Since the MINLP optimization problem is NP-hard, we develop a reinforcement-learning-based system for Link-Channel selection and Power allocation, namely RLCP. We present a comprehensive design of the RLCP system, including a representation of the system state, the design of a reputation metric and a mechanism to learn the control policy. Extensive simulations in NS-3 show that the proposed RLCP system is effective in terms of aggregate throughput, flow fairness and packet delivery delay.

INDEX TERMS High-throughput, link-channel selection, power allocation, reinforcement learning, wireless mesh networks.