Light fidelity (LiFi) is an emerging communication technology that uses light-emitting diodes (LEDs) for high-speed wireless communication. Owing to its large unlicensed bandwidth, LiFi can support high data rates. However, the quality of the LiFi channel fluctuates across the room due to interference, reflections from walls, or blockage. WiFi, in contrast, is a wireless communication technology that provides moderate data rates with ubiquitous coverage. Because the electromagnetic spectrum used by LiFi does not overlap with that of WiFi, the two can coexist to form a hybrid LiFi and WiFi network for seamless, high-throughput connectivity. The performance of a hybrid system depends significantly on the access point (AP) assignment and resource allocation strategies. In this paper, a downlink hybrid system with one WiFi AP and four LiFi APs is considered, and a reinforcement learning (RL) algorithm is implemented to determine an optimal AP assignment strategy that maximizes the long-term system throughput while ensuring the required user fairness and satisfaction. Furthermore, two scenarios based on the random waypoint model, with uniform and non-uniform user distributions, are studied. The performance of the proposed system is compared against state-of-the-art benchmark approaches, e.g., the signal strength strategy (SSS), exhaustive search, and an iterative optimization method. The results are reported in terms of the average system throughput, user satisfaction, fairness, and capacity outage probability. It is shown that the proposed RL method performs close to the exhaustive search scheme at fairly low complexity. The RL method also outperforms the SSS scheme and the iterative algorithm in most scenarios.

INDEX TERMS Hybrid LiFi WiFi, Light Fidelity (LiFi), Load balancing, Reinforcement learning (RL), Trust region policy optimization (TRPO).