Afaf G. Bin Saadon scite author profile

Afaf G. Bin Saadon

3 Publications

1 Citation Statement Received

40 Citation Statements Given

Affiliations

Cairo University

Publications

iiHadoop: an asynchronous distributed framework for incremental iterative computations

Saadon

Mokhtar

2017

J Big Data

Abstract

Data is never static; it keeps growing and changing over time. New data is added, and old data can be modified or deleted. This incremental nature of data motivates the development of new systems that perform large-scale data computations incrementally. MapReduce was introduced to provide an efficient approach for handling large-scale data computations. Nevertheless, it turned out to be inefficient at processing small incremental changes to data. While many previous systems have extended MapReduce to perform iterative or incremental computations, these systems are still inefficient and too expensive for large-scale iterative computations on changing data. In this paper, we present a new system called iiHadoop, an extension of the Hadoop framework that is optimized for incremental iterative computations. iiHadoop accelerates program execution by performing the incremental computations only on the small fraction of data that is affected by changes, rather than on the whole dataset. In addition, iiHadoop improves performance by executing iterations asynchronously and by employing locality-aware scheduling for the map and reduce tasks that takes the incremental and iterative behavior into account. An evaluation of the proposed iiHadoop framework is presented using examples of iterative algorithms, and the results show significant performance improvements over comparable existing frameworks.

Introduction

Today, a large amount of data is being produced in many areas, including e-commerce, social networks, finance, health care, and education. This increase in data volume increases the need for an efficient computing framework to process this data and transform it into meaningful information. In the past years, many distributed computing frameworks [1][2][3][4][5][6] have been developed to perform large-scale data processing. MapReduce [2] (with its open-source implementation, Hadoop [7]) is the most widely used framework because of its simplicity, scalability, efficiency, and reliability. It was proposed by Google in 2004 to enable the distribution and processing of large-scale data over a cluster of commodity machines. The framework takes care of distributed execution, fault tolerance, load balancing, and job scheduling; developers therefore do not need to handle these issues themselves and can concentrate on their own tasks.

Survey on iterative and incremental approaches in distributed computing environment

Saadon

Mokhtar

2019

IJDS

Iterative computation is increasingly needed for a large and important class of applications, such as machine learning and data mining. These iterative applications typically apply computations over large-scale datasets, so it is desirable to develop efficient distributed frameworks that process data iteratively. At the same time, data keeps growing over time as new entries are added and existing entries are deleted or modified. This incremental nature of data makes the previously computed results of iterative applications stale and inaccurate over time. It is hence necessary to periodically refresh the computation so that new changes are quickly reflected in the computed results. This paper presents the existing distributed systems that support iterative and incremental computations on large-scale datasets. It describes the main optimisations and features of these systems and identifies their limitations.
