Younghoon Kim scite author profile

A wide range of applications need to query a large database of texts for similar strings or substrings. Traditional approximate substring matching requires the user to specify a similarity threshold. Without top-k approximate substring matching, users must repeatedly try different maximum distance thresholds when the proper threshold is not known in advance. In our paper, we first propose efficient algorithms for finding the top-k approximate substring matches for a given query string in a set of data strings. To reduce the number of expensive distance computations, the proposed algorithms use our novel filtering techniques, which take advantage of q-grams and available inverted q-gram indexes. We conduct extensive experiments with real-life data sets. Our experimental results confirm the effectiveness and scalability of our proposed algorithms.
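To make the idea concrete, here is a minimal sketch of top-k approximate substring matching with a q-gram filter. It is not the paper's algorithm: it assumes edit distance as the distance measure, ranks whole data strings by their best-matching substring, and recomputes q-grams on the fly instead of using an inverted index. The function names (`qgrams`, `substring_edit_distance`, `lower_bound`, `top_k_matches`) are hypothetical. The filter relies on the standard count bound: one edit operation can destroy at most q of the query's q-grams, so the number of missing q-grams gives a lower bound on the distance.

```python
import heapq
from collections import Counter


def qgrams(s, q):
    """Multiset of all length-q substrings of s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def substring_edit_distance(query, text):
    """Minimum edit distance between query and any substring of text."""
    # Row 0 is all zeros because a match may start at any position in text.
    prev = [0] * (len(text) + 1)
    for i, qc in enumerate(query, 1):
        cur = [i]  # query[:i] against the empty substring
        for j, tc in enumerate(text, 1):
            cur.append(min(prev[j] + 1,                # delete from query
                           cur[j - 1] + 1,             # insert into query
                           prev[j - 1] + (qc != tc)))  # match or substitute
        prev = cur
    return min(prev)  # a match may end at any position in text


def lower_bound(query_grams, n_query_grams, text, q):
    """Count-filter bound: each edit destroys at most q query q-grams."""
    shared = sum((query_grams & qgrams(text, q)).values())
    missing = n_query_grams - shared
    return max(0, -(-missing // q))  # ceil(missing / q)


def top_k_matches(query, strings, k, q=2):
    """Return the k data strings closest to query as (distance, string)."""
    qg = qgrams(query, q)
    n = max(len(query) - q + 1, 0)
    # Visit candidates in order of increasing lower bound.
    candidates = sorted((lower_bound(qg, n, s, q), idx)
                        for idx, s in enumerate(strings))
    heap = []  # max-heap on distance, stored as (-distance, -index)
    for lb, idx in candidates:
        # No remaining candidate can beat the current k-th best distance.
        if len(heap) == k and lb >= -heap[0][0]:
            break
        d = substring_edit_distance(query, strings[idx])
        if len(heap) < k:
            heapq.heappush(heap, (-d, -idx))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, -idx))
    return sorted((-nd, strings[-ni]) for nd, ni in heap)
```

Calling `top_k_matches("approx", data, k=2)` needs no distance threshold: the search stops as soon as the lower bound of the next candidate reaches the k-th best distance found so far, which is what saves the expensive dynamic-programming calls.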

Detection of Rapidly Spreading Hashtags via Social Networks

Kim, Seo. 2020. IEEE Access.

Social network services (SNSs) such as Twitter and Facebook have emerged as a new medium for communication. They offer a unique mechanism for sharing information by allowing users to receive all messages posted by those whom they "follow". As information in today's SNSs often spreads in the form of hashtags, detecting rapidly spreading hashtags in SNSs has recently attracted much attention. In this paper, we propose realistic epidemic models to describe the probabilistic process of hashtag propagation. Our models take into account how users communicate in SNSs; moreover, the models consider the influence of external media and separate it from internal diffusion within networks. Based on the proposed models, we develop efficient inference algorithms that measure the propagation rates of hashtags in social networks. With real-life social network data including hashtags and synthetic data obtained by simulating information diffusion, we show that the proposed algorithms find fast-spreading hashtags more accurately than existing algorithms. Moreover, our in-depth case study demonstrates that our algorithms correctly find internal diffusion rates of hashtags as well as external media influences.
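The separation of internal diffusion from external media influence can be illustrated with a toy simulation. This is only a sketch under assumptions, not the models or inference algorithms from the paper: it assumes discrete time steps, a per-exposure internal adoption probability `beta`, and a per-step external adoption probability `gamma`, all of which are hypothetical names introduced here.

```python
import random


def simulate_hashtag(followees, seeds, beta, gamma, steps, rng=None):
    """Simulate hashtag adoption; return the adopter count after each step.

    followees maps each user to the users they follow (assumed structure).
    beta is the assumed chance of adopting per adopting followee per step.
    gamma is the assumed chance of adopting via external media per step.
    """
    rng = rng or random.Random(0)
    adopted = set(seeds)
    history = [len(adopted)]
    for _ in range(steps):
        new = set()
        for user, sources in followees.items():
            if user in adopted:
                continue
            # Internal diffusion: exposure through followees who adopted.
            exposed = sum(1 for s in sources if s in adopted)
            p_internal = 1 - (1 - beta) ** exposed
            # External media acts independently of the network.
            p = 1 - (1 - p_internal) * (1 - gamma)
            if rng.random() < p:
                new.add(user)
        adopted |= new  # adoptions take effect at the end of the step
        history.append(len(adopted))
    return history
```

With `gamma=0` the hashtag can only travel along follow links, while with `beta=0` adoption is driven purely by outside sources. An inference procedure would work in the opposite direction, estimating the two rates from observed adoption times.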

Efficient processing of substring match queries with inverted variable-length gram indexes

Kim, Park, Shim, et al. 2013. Information Sciences.

Younghoon Kim

Parallel Top-K Similarity Join Algorithms Using MapReduce

TWITOBI: A Recommendation System for Twitter Using Probabilistic Modeling

Efficient top-k algorithms for approximate substring matching

Detection of Rapidly Spreading Hashtags via Social Networks

Efficient processing of substring match queries with inverted variable-length gram indexes
