“…The cognitive controller adopts the Q-learning algorithm to ensure that the optimal policy is followed, representing the action-reward space as an HMM; for further details on the implementation of the proposed approach, we refer the reader to [118]. The results of the proposed method were compared against several other state-of-the-art trackers, including the mean-shift algorithm [119], the fusion filter [120], which uses a covariance matrix trace-based fusion scheme, a modified particle tracker [121], and the least soft-threshold squares tracker [122]. On a dataset of public real image sequences, the proposed technique was shown to achieve tracking results with MSEs comparable to those of the other techniques, despite being a suboptimal implementation [118].…”
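
For readers unfamiliar with Q-learning, the following is a minimal sketch of the generic tabular update rule only. It is not the HMM-based cognitive controller of [118], whose details are not given here. The function names (`q_update`, `greedy_policy`), the list-of-lists table layout, and the values of the learning rate `alpha` and discount factor `gamma` are illustrative assumptions.

```python
# Generic tabular Q-learning sketch (illustrative; not the method of [118]).
# q is a list of rows, one per state; each row holds one value per action.

def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one Q-learning step in place and return the table.

    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    # Temporal-difference target: reward plus discounted best next-state value.
    td_target = r + gamma * max(q[s_next])
    # Move the current estimate a fraction alpha toward the target.
    q[s][a] += alpha * (td_target - q[s][a])
    return q


def greedy_policy(q):
    """Return, for each state, the index of the highest-valued action."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    # Hypothetical toy problem with 2 states and 2 actions.
    table = [[0.0, 0.0], [0.0, 0.0]]
    q_update(table, s=0, a=1, r=1.0, s_next=1)
    print(table, greedy_policy(table))
```

In a tracking setting, the states, actions, and reward would have to be defined by the controller design (for example, a reward tied to tracking error); those choices are outside the scope of this sketch.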