Fault tolerance and the availability of applications, computing infrastructure, and communications systems during unexpected events are critical in cloud environments. The microservices architecture, and the technologies that it uses, should be able to maintain acceptable service levels in the face of adverse circumstances. In this paper, we discuss the challenges faced by cloud infrastructure in relation to providing resilience to applications. Based on this analysis, we present our approach for a software platform based on a microservices architecture, as well as the resilience mechanisms to mitigate the impact of infrastructure failures on the availability of applications. We demonstrate the capacity of our platform to provide resilience to analytics applications, minimizing service interruptions and keeping acceptable response times.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations鈥揷itations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.