Cloud computing is Internet-based computing. This paradigm has enhanced the use of networks by allowing the capability of one node to be utilized by other nodes. Cloud services provide on-demand access to distributed resources such as databases, servers, software, and infrastructure on a pay-as-you-go basis. Load balancing is one of the vexing issues in distributed environments: the service provider's resources must balance the load of client requests. Load balancing is adopted to increase resource utilization in data centers, which in turn enhances overall system performance and client satisfaction.
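To make the idea of balancing client requests across a provider's resources concrete, here is a minimal sketch of one common policy, greedy least-loaded assignment. The abstract does not name a specific algorithm, so the policy, the function name `assign_requests`, and the notion of a numeric request "cost" are all assumptions chosen for illustration.

```python
import heapq


def assign_requests(request_costs, server_names):
    """Greedy least-loaded assignment (illustrative assumption, not the paper's method).

    Each incoming request is sent to the server whose accumulated load is
    currently smallest. Ties are broken by server name.
    """
    # Min-heap of (current_load, server_name); every server starts idle.
    heap = [(0, name) for name in server_names]
    heapq.heapify(heap)

    assignment = []
    for cost in request_costs:
        load, name = heapq.heappop(heap)           # least-loaded server
        assignment.append(name)
        heapq.heappush(heap, (load + cost, name))  # record its new load

    loads = {name: load for load, name in heap}
    return assignment, loads


if __name__ == "__main__":
    assignment, loads = assign_requests([5, 3, 2, 4], ["a", "b"])
    print(assignment)
    print(loads)
```

A real cloud balancer would also have to account for request completion, heterogeneous server capacity, and health checks, none of which this sketch models.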
Nowadays, most industries hold large volumes of data, ranging from terabytes to petabytes, and organizations are looking for ways to handle this growth. Enterprises are using cloud deployments to address big data and analytics, drawing on the interaction between cloud and big data. This paper presents big data issues and research directions for the ongoing work on processing big data in distributed environments.
In today's world, most real-world data is imbalanced by nature. This is because there are no efficient algorithms for putting this data (i.e., data generated by billions of Internet-connected devices, or IoT devices) into a suitable format. Imbalanced data poses a great challenge to both data mining and machine learning algorithms. An imbalanced dataset consists of a majority class and a minority class, where the majority class greatly outnumbers the minority class. Several standard learning algorithms assume a balanced class distribution or equal misclassification costs. If these algorithms are used for prediction on imbalanced data, accuracy will be high for the majority class but low for the minority class, resulting in poor performance on the class that is usually of most interest. To overcome this problem (that is, to improve the accuracy of the decision- or prediction-making process), data mining and machine learning researchers have addressed imbalanced data using data-level, algorithm-level, and ensemble or hybrid methods. This article presents a systematic literature review and analyzes the results of more than 400 research papers published between 2002 and June 2017, resulting in a broader and more elaborate investigation of the literature in this area. Note that an extension of this work, to be published in June 2019, will cover research articles up to December 2018; those additional papers are not included here because of page limits. The systematic analysis of the research literature focuses on the key role of data-intrinsic problems in classification, on handling imbalanced data, and on the techniques used to overcome the skewed distribution. Furthermore, this article reveals patterns, trends, and gaps in the existing literature and briefly discusses next-generation research directions in this area.
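The claim that overall accuracy can look high while minority-class performance is poor can be illustrated with a small worked example. The sketch below assumes a hypothetical dataset with a 95:5 class split and a trivial classifier that always predicts the majority class; the numbers and the function name `evaluate` are illustrative and are not taken from the reviewed papers.

```python
def evaluate(y_true, y_pred, minority_label=1):
    """Return (overall accuracy, recall on the minority class)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    minority_total = sum(t == minority_label for t in y_true)
    minority_hits = sum(
        t == minority_label and p == minority_label for t, p in pairs
    )
    # Guard against a dataset with no minority examples at all.
    recall = minority_hits / minority_total if minority_total else 0.0
    return accuracy, recall


# Hypothetical imbalanced dataset: 95 majority (0) and 5 minority (1) examples.
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

accuracy, recall = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, minority recall={recall:.2f}")
```

Here 95 of 100 predictions are correct, yet none of the five minority examples is detected, which is why metrics beyond plain accuracy matter on skewed data.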