Big data processing technology marks a prominent place in today's market. Hadoop is an efficient open-source distributed framework used to process big data with fewer expenses utilizing a cluster of commodity machines (nodes). In Hadoop, YARN got introduced for effective resource utilization among the jobs. Still, YARN over-allocates the resources for some tasks of a job and keeps the cluster resources underutilized. This paper has investigated the CAPACITY and FAIR schedulers' practical utilization of resources in a multi-tenancy shared environment using the HiBench benchmark suite. It compares the above MapReduce job schedulers' performance in two scenarios and proposes some open research questions (ORQ) with potential solutions to help the upcoming researchers. On average, the authors found that CAPACITY and FAIR schedulers utilize 77% of RAM and 82% of CPU cores. Finally, the experimental evaluation proves that these schedulers over-allocate the resources for some of the tasks and keep the cluster resources underutilized in different scenarios.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.