Collecting signatures to model latency tolerance in high-level simulations of microthreaded cores
Irfan-Ud-Din, M.; Jesshope, C.R.; van Tol, M.W.; Poss, R.C.
ABSTRACT
Current many-core architectures are generally evaluated by detailed emulation, using cycle-accurate simulation to measure execution time. However, simulating the architecture in this much detail makes the evaluation of large programs very slow. Since the focus in many-core architecture is shifting from the performance of the individual core to the overall behavior of the chip, high-level simulations are becoming necessary. These evaluate the same architecture at a less detailed level and allow the designer to make quick and reasonably accurate design decisions. We have developed a high-level simulator for design space exploration of the Microgrid, a many-core architecture composed of many fine-grained multi-threaded cores. This simulator allows us to investigate mapping and scheduling strategies for families (i.e. groups of threads) while developing an operating environment for the Microgrid. The previous method of evaluating the workload, which counted it in basic blocks, was inaccurate. The key problem is that, with many concurrent threads, the latency of certain instructions is hidden because of the multithreaded nature of the core. This paper presents a technique to account for the execution time of different types of instructions under thread concurrency. We expect this technique to achieve high accuracy when evaluating programs in the high-level simulator.