Rising power density and thermal hotspots in manycore systems threaten both reliability and performance, creating a need for efficient dynamic thermal and reliability management. Existing techniques often fail to scale or to account for long-term reliability impacts. This work develops a lightweight, scalable management strategy for manycore systems that integrates dynamic thermal management (DTM) and dynamic reliability management (DRM) through application mapping and task migration. The primary contribution is the FIT-aware Learning Heuristic for Application Allocation (FLEA), which uses Q-learning to optimize task allocation and migration based on Failure In Time (FIT) monitoring. FLEA operates in two phases: a design-time phase that trains a policy table (Q-table) with Q-learning, and a runtime phase that consults this Q-table to decide task allocation and migration. The Q-table entries score task deployment patterns, favoring those that minimize thermal hotspots and maximize system reliability. In the evaluation, FLEA improves on state-of-the-art techniques: it reduces thermal amplitude and peak temperature and improves the spatial thermal distribution, which increases the system's Mean Time To Failure (MTTF).
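The two-phase structure described above (train a Q-table at design time, look it up at runtime) can be illustrated with a minimal tabular Q-learning sketch. This is not FLEA itself. The grid model, the occupancy-bitmask state encoding, the function names (`train_q_table`, `allocate`, `reward`), and all hyperparameters are assumptions made for illustration. In particular, the reward is a hypothetical stand-in that only penalises placing a task next to busy cores; the actual technique derives its decisions from FIT monitoring, and task migration is not modelled here.

```python
import random
from collections import defaultdict


def neighbours(core, rows, cols):
    """Return the cores directly adjacent (4-neighbourhood) to `core`."""
    r, c = divmod(core, cols)
    adj = []
    if r > 0:
        adj.append(core - cols)
    if r < rows - 1:
        adj.append(core + cols)
    if c > 0:
        adj.append(core - 1)
    if c < cols - 1:
        adj.append(core + 1)
    return adj


def free_cores(state, n_cores):
    """Cores whose bit is not set in the occupancy bitmask `state`."""
    return [c for c in range(n_cores) if ((state >> c) & 1) == 0]


def reward(state, core, rows, cols):
    """ASSUMED proxy reward (not a FIT model): -1 per busy neighbouring core."""
    return -sum(1 for n in neighbours(core, rows, cols) if (state >> n) & 1)


def train_q_table(rows, cols, n_tasks, episodes=2000,
                  alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Design-time phase: learn Q[(state, core)] with epsilon-greedy Q-learning.

    Requires n_tasks <= rows * cols so a free core always exists.
    """
    rng = random.Random(seed)
    n_cores = rows * cols
    q = defaultdict(float)  # unseen (state, core) pairs start at 0.0
    for _ in range(episodes):
        state = 0  # every episode starts with an empty chip
        for placed in range(n_tasks):
            free = free_cores(state, n_cores)
            if rng.random() < epsilon:
                core = rng.choice(free)  # explore
            else:
                core = max(free, key=lambda c: q[(state, c)])  # exploit
            r = reward(state, core, rows, cols)
            nxt = state | (1 << core)
            if placed == n_tasks - 1:
                target = r  # last task of the episode: no future term
            else:
                best_next = max(q[(nxt, c)] for c in free_cores(nxt, n_cores))
                target = r + gamma * best_next
            q[(state, core)] += alpha * (target - q[(state, core)])
            state = nxt
    return q


def allocate(q, rows, cols, n_tasks):
    """Runtime phase: place tasks one by one by greedy Q-table lookup."""
    n_cores = rows * cols
    state, placement = 0, []
    for _ in range(n_tasks):
        free = free_cores(state, n_cores)
        core = max(free, key=lambda c: q.get((state, c), 0.0))
        placement.append(core)
        state |= 1 << core
    return placement


if __name__ == "__main__":
    table = train_q_table(rows=2, cols=2, n_tasks=2)
    print(allocate(table, rows=2, cols=2, n_tasks=2))
```

In this toy setting the runtime phase costs only a table lookup per placement, which is the property that makes a pre-trained Q-table attractive for lightweight management. The bitmask state grows exponentially with the core count, so a realistic system would need a more compact state representation than the one assumed here.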