As technology steadily strengthens its presence in all aspects of human life, computing systems integrate ever more processing cores, while applications become more complex and more demanding of computational resources. Inevitably, this sharp increase in processing elements, combined with application resource requirements that cannot be predicted at design time, imposes new design constraints on the resource management of many-core systems, making distributed functionality a necessity. In this work, we present a distributed runtime resource management framework for many-core systems utilizing a network-on-chip (NoC) infrastructure. Specifically, we couple the concept of distributed management with parallel applications by assigning different roles to the available computing resources. The presented design is based on the idea of local controllers and managers, while an on-chip intercommunication scheme ensures decision distribution. The framework was evaluated on an Intel Single-Chip Cloud Computer, an actual NoC-based many-core system. Experimental results show that the proposed scheme allocates resources efficiently at runtime, leading to gains of up to 30% in application execution latency compared to relevant state-of-the-art distributed resource management frameworks.
This paper presents the cloud infrastructure of the AEGLE project, which aims to integrate cloud technologies with heterogeneous reconfigurable computing in large-scale healthcare systems for Big Bio-Data analytics. AEGLE's engineering concept brings together popular big-data engines and emerging acceleration technologies, laying the foundation for personalized and integrated healthcare services while also promoting related research activities. We introduce the design of AEGLE's accelerated infrastructure, along with the corresponding software and hardware acceleration stacks that support various big-data analytics workloads. We show that, through effective resource containerization, AEGLE's cloud infrastructure can support high heterogeneity with regard to storage types, execution engines, utilized tools and execution platforms. Special care is given to the integration of high-performance accelerators within the overall software stack of AEGLE's infrastructure. According to our preliminary evaluations, these accelerators enable efficient execution of analytics, up to 140× faster than pure software execution.
Lately, the advancement in circuit technology, combined with the design of low-cost embedded devices, has resulted in the infiltration of the latter into everyday human life. To exploit the full potential of ubiquitous embedded devices, a network is used for their intercommunication, offering advanced real-time monitoring. This paradigm, known as the Internet of Things (IoT), is steadily being consolidated and promises to offer a wide variety of applications. However, with the adoption of IoT, new challenges arise, such as the design of architectures able to support the requirements of the new applications. Towards this goal, we explore a three-layered architecture able to acquire, process and store healthcare data as well as to provide real-time decision making. We use ECG signal arrhythmia detection as our evaluation use case, and compare different techniques for wireless communication, storage and data classification. Experimental results show that our architecture provides real-time decision making with an average delay of 15 μs, and that the choice of communication technology can reduce power consumption on the monitoring devices by up to 10%.
Many-core systems are envisioned to meet the ever-increasing demand for more powerful computing systems. To provide the necessary computing power, the number of Processing Elements integrated on-chip increases, and NoC-based infrastructures are adopted to address interconnection scalability. The advent of these new architectures surfaces the need for more sophisticated, distributed resource management paradigms. Together with extreme integration scaling, these architectures also make the new systems more prone to errors manifested in both hardware and software. In this work, we highlight the need for run-time resource management to be enhanced with fault tolerance features and propose SoftRM, a resource management framework which can dynamically adapt to permanent failures in a self-organized, workload-aware manner. Self-organization allows the resource management agents to recover from a failure in a coordinated way by electing a new agent to replace the failed one, while workload awareness optimizes this choice according to the status of each core. We evaluate the proposed framework on the Intel Single-Chip Cloud Computer (SCC), a NoC-based many-core system, and customize it to achieve minimum interference with the resource allocation process. We show that its workload-aware features manage to utilize free resources in more than 90% of the conducted experiments. Comparison with relevant state-of-the-art fault-tolerant frameworks shows a decrease of up to 67% in the overhead imposed on application execution.
CCS Concepts: • General and reference → Cross-computing tools and techniques; • Computer systems organization → Multicore architectures; • Networks → Network on chip; • Computing methodologies → Self-organization;
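The workload-aware election described in the abstract above can be illustrated with a minimal sketch. It is not SoftRM's actual algorithm: the `Core` record, the `load` metric and the tie-breaking rule are all assumptions introduced here for illustration. The sketch only shows the general idea of replacing a failed agent with the least-loaded live core.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Core:
    # Hypothetical per-core status record; field names are assumptions.
    core_id: int
    alive: bool
    load: float  # assumed metric: 0.0 = idle, 1.0 = fully busy


def elect_replacement(cores: List[Core], failed_id: int) -> Optional[Core]:
    """Choose a live core to take over the role of the failed agent.

    The least-loaded core is preferred (workload awareness). Ties are
    broken by the lowest core id, so every agent that evaluates the same
    status view arrives at the same choice.
    """
    candidates = [c for c in cores if c.alive and c.core_id != failed_id]
    if not candidates:
        return None  # no live core is left to take over
    return min(candidates, key=lambda c: (c.load, c.core_id))


# Example status view: core 0 hosted the failed agent.
cores = [
    Core(core_id=0, alive=False, load=0.0),
    Core(core_id=1, alive=True, load=0.8),
    Core(core_id=2, alive=True, load=0.0),
    Core(core_id=3, alive=True, load=0.0),
]
new_agent = elect_replacement(cores, failed_id=0)
```

In this example the idle cores 2 and 3 are preferred over the busy core 1, and the deterministic tie-break selects core 2. A real framework would also need to distribute the status view and detect the failure, which this sketch leaves out.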