Wathsala Anupama Mohotti scite author profile

Wathsala Anupama Mohotti


14 Publications

21 Citation Statements Received

29 Citation Statements Given


Affiliations

University of Ruhuna, Queensland University of Technology

Publications


Multi-type Relational Data Clustering for Community Detection by Exploiting Content and Structure Information in Social Networks

et al. 2019

An effective short-text topic modelling with neighbourhood assistance-driven NMF in Twitter

2022

Soc. Netw. Anal. Min.

Social media platforms such as Twitter connect billions of people by letting them exchange thoughts through short-text communication. Topic modelling is a widely used technique for analysing short texts. Distance-based, density-based and dimensionality reduction-based methods struggle to discover topic clusters in short-text collections, because the high dimensionality and short length of the documents produce extremely sparse text representation matrices. We propose a ‘neighbourhood-based assistance’-driven non-negative matrix factorization (NMF) method that handles high-dimensional, sparse short-text representations effectively through lower-dimensional projection. We use NMF, which aligns with the natural non-negativity of text data, coupled with symmetric document affinity information to identify the topic distribution in short text. Neighbourhood information among documents is captured using Jaccard similarity to compensate for the information loss that results from higher-to-lower-dimensional projection. Experimental results on Twitter data sets show quantitatively that the proposed approach attains high accuracy compared to state-of-the-art methods, while qualitative analysis with case studies validates its ability to generate meaningful topic clusters.
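As a rough illustration of the ingredients named in this abstract, the sketch below builds a symmetric Jaccard affinity matrix over a few toy short texts and factorizes it jointly with the document-term matrix using plain NMF. It is not the authors' algorithm: the coupling by column-wise concatenation, the weight `alpha`, the multiplicative-update solver, the helper names and the toy documents are all assumptions made for the example.

```python
import numpy as np

def jaccard_affinity(docs):
    """Symmetric document-by-document Jaccard similarity over token sets."""
    sets = [set(d.lower().split()) for d in docs]
    n = len(sets)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            union = sets[i] | sets[j]
            S[i, j] = len(sets[i] & sets[j]) / len(union) if union else 0.0
    return S

def term_matrix(docs):
    """Raw term counts, one row per document (hypothetical helper)."""
    vocab = sorted({t for d in docs for t in d.lower().split()})
    index = {t: k for k, t in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for t in d.lower().split():
            X[i, index[t]] += 1.0
    return X, vocab

def nmf(A, k, iters=200, seed=0, eps=1e-9):
    """Multiplicative updates for A ~ W @ H under the Frobenius loss."""
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], k)) + eps
    H = rng.random((k, A.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy short texts, invented for this example.
docs = [
    "rain flood storm warning",
    "storm flood warning issued",
    "football match goal score",
    "goal score football final",
]
X, vocab = term_matrix(docs)
S = jaccard_affinity(docs)

# Assumption: couple content and neighbourhood information by sharing
# the document factor W across both matrices.
alpha = 1.0
A = np.hstack([X, alpha * S])
W, H = nmf(A, k=2)

# Assign each document to its strongest latent topic.
topics = W.argmax(axis=1)
```

Sharing `W` across the two blocks means that documents with overlapping vocabularies are pulled towards the same latent topic even when their own term rows are very sparse, which is the intuition behind using neighbourhood information here.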

Efficient Outlier Detection in Text Corpus Using Rare Frequency and Ranking

2020

ACM Trans. Knowl. Discov. Data

Outlier detection in text data collections has become significant due to the need to find anomalies in the myriad of text data sources. High feature dimensionality, together with the large size of these document collections, calls for accurate outlier detection methods that are also highly efficient. Traditional outlier detection methods face several challenges when dealing with text data, including data sparseness, distance concentration, and the presence of a large number of sub-groups. In this article, we address these issues by developing novel concepts such as representing documents with the rare document frequency, finding a ranking-based neighborhood for similarity computation, and identifying sub-dense local neighborhoods in high dimensions. To improve the proposed primary method based on rare document frequency, we present several novel ensemble approaches that use the ranking concept to reduce false identifications while finding a higher number of true outliers. Extensive empirical analysis shows that the proposed method and its ensemble variations improve the quality of outlier detection in document repositories and scale well compared to the relevant benchmarking methods.
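The idea of scoring documents by how many rare terms they contain can be illustrated with a very small sketch. This is only a loose reading of the abstract, not the method from the article: the `max_df` threshold, the share-of-rare-terms score, the function name and the toy documents are assumptions made for the example.

```python
from collections import Counter

def rare_term_scores(docs, max_df=1):
    """Score each document by the share of its distinct terms that are rare.

    A term counts as rare when it occurs in at most `max_df` documents.
    """
    tokenized = [set(d.lower().split()) for d in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in toks)
    scores = []
    for toks in tokenized:
        if not toks:
            # Empty documents get a neutral score in this sketch.
            scores.append(0.0)
            continue
        rare = sum(1 for t in toks if df[t] <= max_df)
        scores.append(rare / len(toks))
    return scores

# Toy corpus, invented for this example.
docs = [
    "stock market prices rise",
    "market prices fall sharply",
    "stock prices rise again",
    "recipe for lemon cake",
]
scores = rare_term_scores(docs)

# Rank documents from most to least outlier-like.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Because the score needs only one pass to count document frequencies and one pass to score, it avoids pairwise distance computations, which is one plausible reason a frequency-based representation can be efficient on large collections.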

Can We Define Design? Analyzing Twenty Years of Debate on a Large Email Discussion List

et al. 2021

She Ji: The Journal of Design, Economics, and Innovation

Concept Mining in Online Forums Using Self-corpus-Based Augmented Text Clustering

2019

Corpus-Based Augmented Media Posts with Density-Based Clustering for Community Detection

2018

Unsupervised text mining: Effective similarity calculation with ranking and matrix factorization


An Efficient Ranking-Centered Density-Based Document Clustering Method

2018

