Foraging is an essential behavior for animal survival and requires both learning and decision-making skills. However, despite its relevance and ubiquity, there is still no effective mathematical framework for estimating foraging performance that also takes interindividual variability into account. In this work, foraging performance is evaluated in the context of multi-armed bandit (MAB) problems by means of a biological model and a machine learning algorithm. Siamese fighting fish (Betta splendens) were used as the biological model, and their foraging ability was assessed in a four-arm cross-maze over 21 trials. Fish performance was observed to vary with basal cortisol level: reduced average reward is associated with both low and high basal cortisol, while an intermediate, optimal level maximizes foraging performance. In addition, we suggest adopting the epsilon-greedy algorithm to handle the exploration-exploitation tradeoff and simulate foraging decisions. The algorithm produced results closely matching the biological model and allowed the normalized basal cortisol levels to be correlated with a corresponding tuning parameter. These results indicate that machine learning, by helping to shed light on the intrinsic relationships between physiological parameters and animal behavior, can be a powerful tool for studying animal cognition and behavioral sciences.
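The epsilon-greedy decision rule referred to above can be sketched for a Bernoulli multi-armed bandit. This is a minimal illustration, not the authors' simulation: the four reward probabilities, the value of epsilon, and the number of trials are all assumptions chosen for the example.

```python
import random

def epsilon_greedy(reward_probs, epsilon=0.1, trials=1000, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli multi-armed bandit.

    With probability epsilon the agent explores a random arm; otherwise it
    exploits the arm with the highest estimated mean reward so far.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0
    for _ in range(trials):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1 if rng.random() < reward_probs[arm] else 0
        counts[arm] += 1
        # incremental update of the sample-mean estimate for this arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return total_reward, estimates

# Four arms, mirroring the four-arm cross-maze; probabilities are illustrative.
reward, est = epsilon_greedy([0.2, 0.4, 0.6, 0.8], epsilon=0.1, trials=1000)
```

In this framing, epsilon plays the role of the tuning parameter the abstract correlates with normalized basal cortisol: larger epsilon means more exploration, smaller epsilon means more exploitation.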
A ROV (Remotely Operated underwater Vehicle) system needs a precise estimate of its position in order to avoid damage and imprecise movements in certain missions. In this paper, an intelligent controller for the trajectory tracking of a ROV subject to external forces (water dynamics, currents, animals, etc.) is proposed, based on feedback linearization with an adaptive function optimized using a machine learning strategy. This work employs a reinforcement learning algorithm, epsilon-greedy, to let the controller discover for itself an optimal learning rate for the compensator. Numerical results confirm a strong improvement in controller performance when the proposed compensator and learning strategy are applied.
Feedback linearization and reinforcement learning for controlling the positioning system of a ROV
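The learning-rate selection described in the abstract can be sketched as an epsilon-greedy search over a discrete set of candidate rates. This is a sketch under assumptions, not the authors' controller: the candidate rates, the reward function (a stand-in for negative tracking error), and the episode count are all hypothetical.

```python
import random

def select_learning_rate(candidates, reward_fn, epsilon=0.2, episodes=200, seed=0):
    """Epsilon-greedy search over a discrete set of candidate learning rates.

    Each candidate rate is treated as a bandit arm; reward_fn(rate) returns a
    performance score for one episode, e.g. negative tracking error of the
    compensated controller.
    """
    rng = random.Random(seed)
    counts = [0] * len(candidates)
    values = [0.0] * len(candidates)
    for _ in range(episodes):
        if rng.random() < epsilon:
            i = rng.randrange(len(candidates))                      # explore
        else:
            i = max(range(len(candidates)), key=lambda k: values[k])  # exploit
        r = reward_fn(candidates[i])
        counts[i] += 1
        values[i] += (r - values[i]) / counts[i]  # running-mean update
    best = max(range(len(candidates)), key=lambda k: values[k])
    return candidates[best]

# Toy reward whose optimum sits at a learning rate of 0.05 (illustrative only).
best = select_learning_rate(
    [0.001, 0.01, 0.05, 0.1, 0.5],
    reward_fn=lambda lr: -abs(lr - 0.05),
)
```

In practice, reward_fn would run one tracking episode of the feedback-linearized ROV controller with the given adaptation rate and return a score derived from the accumulated position error.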