Managing cloud applications is complex, and the current state of the art does not address this issue. The ever-growing software ecosystem continues to increase the knowledge required to manage cloud applications at a time when there is already an IT skills shortage. Solving this issue requires capturing IT operations knowledge in software so that it can be reused by system administrators who do not have it. The presented research tackles this issue by introducing a new and fundamentally different approach to cloud application management: a hierarchical collection of independent software agents that collectively manage the cloud application. Each agent encapsulates knowledge of how to manage specific parts of the cloud application, is driven by sending and receiving cloud models, and collaborates with other agents through conversations. The entirety of communication and collaboration in this collection is called the orchestrator conversation. A thorough evaluation shows that the orchestrator conversation makes it possible to encapsulate IT operations knowledge that current solutions cannot, reduces the complexity of managing a cloud application, and is inherently concurrent. The evaluation also shows that the conversation figures out how to deploy a single big data cluster in less than 100 milliseconds, which scales linearly to less than 10 seconds for 100 clusters, a minimal overhead compared with the deployment time of at least 20 minutes with the state of the art.
INTRODUCTION
Managing cloud applications is complex. System administrators (sysadmins) need an in-depth understanding of all the components of the cloud application, such as the operating system, the webserver, and X.509 certificates. Having such deep knowledge of how to deploy, configure, monitor, and manage these components is almost impossible in the field of big data because of the size of the ecosystem, the complexity of the tools involved, and the rapid pace of innovation. This would not be such a big problem if it were not for the large skills shortage in the fields of IT operations¹ and big data.² There is thus a need to share and reuse the knowledge of sysadmins across teams and companies.
Sharing IT knowledge is not a new concept. The field of software development, for example, places a strong focus on sharing and reusing knowledge in the form of code libraries. Over the years, a vast number of code libraries have been created that encapsulate an enormous amount of knowledge. Developers use these libraries to quickly write software without having
Int J Network Mgmt. 2018;28:e2036. wileyonlinelibrary.com/journal/nem