Jaeseok Myung scite author profile

In this paper, we translate the multiplication of several matrices into a multi-way join operation among several relations. Matrix multiplication is widely used for many graph algorithms, such as those that calculate the transitive closure. These algorithms benefit from the multi-way join operation because this operation reduces the number of binary multiplications. Our implementation is based on the MapReduce framework, allowing us to provide scalable computation for large matrices. Although several papers have investigated matrix multiplication using MapReduce, this paper takes a different perspective. First, we expand the problem from binary multiplication to n-ary multiplication. For this reason, we apply the concept of parallelism, not only to an individual operation but also to the entire equation. Second, we represent a matrix as a relation consisting of (row, col, val) records and translate a multiplication into a join operation in database systems. This facilitates the efficient storage of sparse matrices, which are very common in real-world graph data, and the easy manipulation of matrices. Although this work is still in progress, we conducted a number of experiments to verify the idea. We also discuss current limitations and future work.
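The relational view described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' MapReduce implementation: the function names (to_relation, join_multiply, chain_multiply) are hypothetical, and everything runs in a single process. It only shows how a product of matrices stored as (row, col, val) records reduces to an equi-join on the shared index followed by a grouped sum, and how a three-matrix chain can be expressed as one three-way join instead of two binary multiplications.

```python
from collections import defaultdict


def to_relation(dense):
    """Convert a dense list-of-lists matrix into sparse (row, col, val) records."""
    return [(i, j, v)
            for i, row in enumerate(dense)
            for j, v in enumerate(row)
            if v != 0]


def _index_by_row(rel):
    """Group records by their row index, which is the join key on the right side."""
    index = defaultdict(list)
    for r, c, v in rel:
        index[r].append((c, v))
    return index


def join_multiply(a, b):
    """Binary product as a join: a.col = b.row, then SUM(a.val * b.val)
    grouped by (a.row, b.col)."""
    b_by_row = _index_by_row(b)
    out = defaultdict(int)
    for i, k, va in a:
        for j, vb in b_by_row.get(k, ()):
            out[(i, j)] += va * vb
    return sorted((i, j, v) for (i, j), v in out.items() if v != 0)


def chain_multiply(a, b, c):
    """Three-way join: a.col = b.row AND b.col = c.row, aggregated in one pass
    without materializing the intermediate product a*b."""
    b_by_row = _index_by_row(b)
    c_by_row = _index_by_row(c)
    out = defaultdict(int)
    for i, k, va in a:
        for l, vb in b_by_row.get(k, ()):
            for j, vc in c_by_row.get(l, ()):
                out[(i, j)] += va * vb * vc
    return sorted((i, j, v) for (i, j), v in out.items() if v != 0)
```

Zero entries never appear in the relations, so sparse matrices cost storage and join work only for their nonzero cells. In a distributed setting the join key would presumably decide which reducer receives each record, but that partitioning is outside the scope of this sketch.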


Handling data skew in join algorithms using MapReduce

Myung

Shim

Yeon

et al. 2016

Expert Systems with Applications


PicAChoo: A Text Analysis Tool for Customizable Feature Selection with Dynamic Composition of Primitive Methods

Myung

Yang

Lee

2010

JSW


Although documents have hundreds of thousands of unique words, only a small number of words are significantly useful for text analysis. Thus, feature selection has become an important issue to be addressed in various text analysis studies. A number of techniques and algorithms for feature selection are available, but unfortunately, it is hard to say that a certain algorithm outperforms the others, because feature selection results mostly depend on the source documents. We should pick and choose the appropriate algorithm and the best subset of feature words whenever we need to analyze source documents. In this paper, we present a framework named ‘PicAChoo’, which stands for ‘Pick And Choose’ and enables customizable feature selection environments by composing several primitive feature selection methods without hard-coding. As indicated in the name, this framework provides many strategies for extracting appropriate features and allows dynamic compositions among several feature selection methods. In addition, it tries to give users an environment that utilizes linguistic characteristics of textual data, namely part-of-speech, sentence structures, and so on. Finally, we illustrate that selected feature words can be used for various intelligent services.




Jaeseok Myung

SPARQL basic graph pattern processing with iterative MapReduce

A General Maturity Model and Reference Architecture for SaaS Service

Matrix chain multiplication via multi-way join algorithms in MapReduce

Handling data skew in join algorithms using MapReduce

PicAChoo: A Text Analysis Tool for Customizable Feature Selection with Dynamic Composition of Primitive Methods
