Zide Meng scite author profile

Abstract: More and more Internet companies rely on large-scale data analysis as part of their core services, for tasks such as log analysis, feature extraction, or data filtering. MapReduce, through its Hadoop implementation, has proved to be an efficient model for dealing with such data. One important challenge when performing such analysis is predicting the performance of individual jobs. In this paper, we propose a simple framework to predict the performance of Hadoop jobs. It is composed of a dynamic, lightweight Hadoop job analyzer and a prediction module using locally weighted regression methods. Our framework makes some theoretical cost models more practical and adapts well to the diversity of jobs and clusters. It can also help users who want to predict the cost when applying for an on-demand cloud service. Finally, we run experiments to verify our framework.
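
The abstract above names locally weighted regression as the prediction method but gives no details. As a rough illustration only, the following Python sketch fits a weighted straight line around a query point using a Gaussian kernel. The function name `lwr_predict`, the `bandwidth` parameter, the choice of input size as the single feature, and the sample numbers are all assumptions made for this example and are not taken from the paper.

```python
import math


def lwr_predict(xs, ys, query, bandwidth=1.0):
    """Predict y at `query` with 1-D locally weighted linear regression."""
    # Gaussian kernel: training points close to the query get larger weights.
    weights = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    total = sum(weights)
    # Weighted means of the feature and the target.
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    # Weighted least-squares slope; fall back to a flat line if x has no spread.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y + slope * (query - mean_x)


# Hypothetical job history: input size in GB and observed runtime in seconds.
input_gb = [1, 2, 4, 8, 16]
runtime_s = [35, 60, 110, 210, 410]

# Estimate the runtime of a new job with 6 GB of input.
predicted = lwr_predict(input_gb, runtime_s, query=6, bandwidth=4.0)
print(round(predicted, 2))
```

Because the toy data lie exactly on a straight line, the estimate for 6 GB comes out at about 160 seconds whatever the bandwidth. With real, noisy job histories the bandwidth controls how strongly the fit favors past jobs that resemble the query.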


Joint Model of Topics, Expertises, Activities and Trends for Question Answering Web Applications

Meng, Gandon, Faron-Zucker. 2016




A Game Theory Based MapReduce Scheduling Algorithm

Song, Meng, et al. 2013


Empirical study on overlapping community detection in question and answer sites

Meng, Gandon, Faron-Zucker, et al. 2014


In many social networks, people interact based on their interests. Community detection algorithms are therefore useful for revealing the sub-structures of a network and for finding interest groups. Identifying these social communities can help in understanding and predicting users' behaviors. However, some kinds of online community sites, such as question-and-answer (Q&A) sites or forums, have no friendship-based social network structure, which means people are not aware of who they are in contact with. Therefore, many traditional community detection techniques do not apply directly. In this paper, we propose an empirical approach for extracting data from Q&A sites in a form suitable for community detection methods. We then compare three kinds of community detection methods applied to a dataset extracted from the popular Q&A site StackOverflow. We analyze and comment on the results of each method.



Zide Meng

A Practical Performance Model for Hadoop MapReduce

A Hadoop MapReduce Performance Prediction Method
