Advancements in technology enable integration of multiple devices on a single core, resulting in increased on-chip power and temperature densities. Higher temperatures, in turn, present a significant challenge for reliability. In this work, we propose a comprehensive framework for analyzing the reliability of multi-core systems, considering permanent faults. We show that aggressive power management can have an impact on reliability due to temperature cycling. Our cycle-accurate simulation methodology shows fine-grained variations of device failure rates over short time scales, thus enabling workload analysis and scheduling to control the reliability impact. The statistical reliability simulator and optimizer, by contrast, support reliability analysis over a long time horizon (the system lifetime) and help us optimize a power management policy under reliability and performance constraints. We show that our optimization strategy can achieve large power savings while still meeting the reliability and performance constraints.
Today's embedded systems integrate multiple IP cores for processing, communication, and sensing on a single die as systems-on-chip (SoCs). Aggressive transistor scaling, decreased voltage margins, and increased processor power and temperature have made reliability assessment a much more significant issue. Although the reliability of devices and interconnect has been broadly studied, in this work we study the tradeoff between reliability and power consumption for component-based SoC designs. We specifically focus on hard error rates, as hard errors cause a device to permanently stop operating. We also present a joint reliability and power management optimization problem whose solution is an optimal management policy. When careful joint policy optimization is performed, we obtain a significant improvement in energy consumption (40%) while meeting a reliability constraint for all SoC operating temperatures.
Recent research on the robust and stochastic travelling salesman problem and the vehicle routing problem has seen many different approaches for describing the region of ambiguity, such as taking convex combinations of observed demand vectors or imposing constraints on the moments of the spatial demand distribution. One approach that has been used outside the transportation sector is the use of statistical metrics that define a distance between two probability distributions. In this paper, we consider a distributionally robust version of the Euclidean travelling salesman problem in which we compute the worst-case spatial distribution of demand over all distributions whose Wasserstein distance to an observed demand distribution is bounded from above. This constraint allows us to circumvent the overestimation that commonly arises when other procedures are used, such as fixing the center of mass and the covariance matrix of the distribution. Numerical experiments confirm that our new approach is useful as a decision support tool for dividing a territory into service districts for a fleet of vehicles when limited data is available.
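The Wasserstein constraint in the abstract above can be illustrated with a small sketch. It is not the paper's method: it only checks whether one finite set of demand points lies within a given Wasserstein radius of an observed set. It assumes both sets have the same number of equally weighted points and uses the order-1 distance with the Euclidean ground metric, in which case the optimal transport plan is a one-to-one matching. The function names `wasserstein1_equal_weights` and `in_wasserstein_ball` are hypothetical, and the brute-force search is only practical for a handful of points.

```python
from itertools import permutations
from math import dist


def wasserstein1_equal_weights(xs, ys):
    """Order-1 Wasserstein distance between two uniform empirical
    distributions on the same number of points in the plane.

    Assumption: equal sample sizes and equal weights, so an optimal
    transport plan is a permutation. Brute force, tiny inputs only.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("this sketch needs two non-empty samples of equal size")
    n = len(xs)
    # Try every one-to-one matching and keep the cheapest total distance.
    best_total = min(
        sum(dist(x, ys[j]) for x, j in zip(xs, perm))
        for perm in permutations(range(n))
    )
    return best_total / n


def in_wasserstein_ball(candidate, observed, radius):
    """True if `candidate` is within `radius` of `observed` (hypothetical helper)."""
    return wasserstein1_equal_weights(candidate, observed) <= radius


# Illustrative data only: three observed demand points and a shifted copy.
observed = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
shifted = [(x + 0.1, y) for x, y in observed]

if __name__ == "__main__":
    print(wasserstein1_equal_weights(shifted, observed))
    print(in_wasserstein_ball(shifted, observed, 0.2))
```

The ambiguity set described in the abstract ranges over all distributions within the radius, including continuous ones, so a membership test on point sets like this one conveys only the shape of the constraint, not the worst-case computation itself.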