Question answering is one of the central topics in natural language processing today, with uses across many applications. This project proposes an original architecture for answering open-domain, multi-hop questions over text and tables, using the OTT-QA dataset for training and validation. Answering such questions requires searching a large corpus and reasoning over several passages and tables, because the answer often cannot be found in a single one. A common solution is to retrieve information sequentially, so that each selected text guides the search for the next. Because different models can play different roles in this iterative search, one challenge is coordinating them when no labeled data exists for the path to follow. Our architecture uses a model trained with reinforcement learning to choose sequentially among different state-of-the-art tools until, at the end, a block is selected to generate the answer. Our system achieved an F1-score of 19.03, comparable to similar iterative systems in the literature.
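
The sequential tool selection described above can be illustrated with a minimal sketch. It is not the system's implementation: the tool names, the `run_episode` function, and the hand-written `toy_policy` are all hypothetical stand-ins, and a trained reinforcement learning policy would take the place of `toy_policy`. The sketch only shows the control flow, in which a policy repeatedly picks a tool, the tool adds evidence, and the loop ends when an answering block is chosen.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type aliases; the source does not specify its interfaces.
Evidence = List[str]
Tool = Callable[[str, Evidence], str]      # returns one new piece of evidence
Answerer = Callable[[str, Evidence], str]  # returns the final answer
Policy = Callable[[str, Evidence], str]    # returns the name of the next action


def run_episode(
    question: str,
    tools: Dict[str, Tool],
    answerers: Dict[str, Answerer],
    policy: Policy,
    max_steps: int = 5,
) -> Tuple[str, Evidence]:
    """Let the policy pick tools one at a time until it picks an answerer."""
    if not answerers:
        raise ValueError("at least one answerer is required")
    evidence: Evidence = []
    for _ in range(max_steps):
        action = policy(question, evidence)
        if action in answerers:
            # The policy selected an answering block: stop and answer.
            return answerers[action](question, evidence), evidence
        # Otherwise run the chosen retrieval tool and keep its output.
        evidence.append(tools[action](question, evidence))
    # Step budget exhausted: fall back to the first answerer (an assumption).
    fallback = next(iter(answerers.values()))
    return fallback(question, evidence), evidence


# Toy stand-ins so the sketch runs; real tools would be retrievers and readers.
def toy_table_retriever(question: str, evidence: Evidence) -> str:
    return "table: city=Lyon"


def toy_text_retriever(question: str, evidence: Evidence) -> str:
    return "passage: Lyon is in France"


def toy_reader(question: str, evidence: Evidence) -> str:
    # Returns the last word of the most recent evidence, or "" if none.
    return evidence[-1].split()[-1] if evidence else ""


def toy_policy(question: str, evidence: Evidence) -> str:
    # Fixed order standing in for a learned policy: table, then text, then read.
    order = ["retrieve_table", "retrieve_text"]
    return order[len(evidence)] if len(evidence) < len(order) else "read"


TOOLS: Dict[str, Tool] = {
    "retrieve_table": toy_table_retriever,
    "retrieve_text": toy_text_retriever,
}
ANSWERERS: Dict[str, Answerer] = {"read": toy_reader}

answer, trace = run_episode(
    "In which country is the city listed in the table?",
    TOOLS,
    ANSWERERS,
    toy_policy,
)
print(answer, trace)
```

In this toy run the first hop retrieves a table row and the second retrieves a passage about the entity found in that row, which mirrors the multi-hop pattern over tables and text. Since no labeled paths exist, a learned policy would have to be trained from a reward signal such as answer quality rather than from step-by-step supervision.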