Automatically generated spam detection based on sentence-level topic information

Suhara, Yoshihiko; Toda, Hiroyuki; Nishioka, Shuichi; Susaki, Seiji

doi:10.1145/2487788.2488140

Cited by 10 publications

(4 citation statements)

References 15 publications

(13 reference statements)

“…survey research on the method of topic link detection based on improved information bottleneck theory [10], in this paper, a method of representing text is proposed, which can divide text into several sections of sub-topic features based on the regular pattern of semantic distribution and improve information bottleneck theory, then, the text represented by the attributes is utilized to do topic link detection, the experimental results have shown that this method has a fast convergent rate, and can improve the performance of topic link detection system. Suhara, Yoshihiko and others survey research on the method of information detection based on sentence-level topic [11], in this paper, the text sentence-level diversity features based on the probabilistic topic model is proposed, an information content classifier is also constructed combining features proposed, the experimental results show that this method outperforms the conventional methods. Pang, JB and others survey research on the method of unsupervised web topic detection using a ranked clustering-like pattern across similarity cascades [12], in this paper, a method using a clusteringlike pattern across similarity cascades is investigated from the perspective of similarity diffusion, a topic-restricted similarity diffusion process is also proposed to identify real topic from a large number of candidates efficiently, the experimental results demonstrate that this approach outperforms the state-of-the-art methods on several public data sets, those works are related to author's research direction of network topic detection and application.…”

Section: Related Work (mentioning)

confidence: 93%

Model of Network Topic Detection Based on Web Usage Behaviour Mode Analysis and Mining Technology

Chen

2017

INT J COMPUT COMMUN

“…The rational for using topic modeling is that spams have more unusual topic distributions than non-spam messages. For example, Suhara et al (2013) developed a sentence level LDA to assign topics to sentences for web spam detection. Biro et al, (2009) defined a threshold based on the outputs of LDA to distinguish between spam and non-spam.…”

Section: Related Work (mentioning)

confidence: 99%

Exploiting latent content based features for the detection of static SMS spams

Karami

Zhou

2014

Proc of Assoc for Info

As the use of mobile phones grows, spams are becoming increasingly common in mobile communication such as SMS, calling for research on SMS spam detection. Existing detection techniques for SMS spams have been mostly adapted from those developed for other contexts such as emails and the web without taking into account some unique characteristics of SMS. Additionally, spamming tactics is constantly evolving, making existing methods for spam detection less effective. In this research, we propose to exploit latent content based features for the detection of static SMS spams. The efficacy of the proposed features is empirically validated using multiple classification methods. The results demonstrate that the proposed features significantly improve the performance of SMS spam detection.

“…In [53] the authors extracted features based on sentence-level topic information. They first created LDA [11] with a ham corpus and apply it to the unseen documents to infer the topic distribution of the sentences.…”

Section: Natural Language Processing Approach (mentioning)

confidence: 99%

Approaches for Web Spam Detection

Hans, Ahuja, Muttoo

2014

IJCA

Spam is a major threat to web security. The web of trust is being abused by the spammers through their ever evolving new tactics for their personal gains. In fact, there is a long chain of spammers who are running huge business campaigns under the web. Spam causes underutilization of search engine resources and creates dissatisfaction among web community. Web Security being a prime challenge for search engines has motivated the researchers in academia and industry to devise new techniques for web spam detection. In this paper we present a comprehensive survey of techniques for detection of web spam and discuss their applicability and performance in various scenarios where they outperformed the others. We have categorized web spam detection with the primary focus on the approaches used for spam detection. The paper also gives the possible directions for future work.
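
The citation statements above describe the cited paper as inferring a topic distribution for each sentence (with LDA built on a ham corpus) and deriving sentence-level diversity features from it. The following is a minimal sketch of that general idea, not the authors' feature set: it assumes the per-sentence topic distributions have already been inferred by some topic model, and the function name `sentence_topic_features` and the three features it returns are hypothetical choices made for illustration.

```python
import math
from typing import Dict, List, Sequence


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence (natural log) between two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def sentence_topic_features(
    sentence_topics: List[Sequence[float]],
) -> Dict[str, float]:
    """Illustrative (assumed) diversity features for one document.

    sentence_topics holds one topic distribution per sentence, e.g. as
    inferred by a topic model trained on non-spam text.
    """
    n = len(sentence_topics)
    if n == 0:
        raise ValueError("document has no sentences")
    k = len(sentence_topics[0])

    # Document-level distribution: average of the sentence distributions.
    doc = [sum(s[i] for s in sentence_topics) / n for i in range(k)]

    # Average topical divergence over all sentence pairs.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    mean_jsd = (
        sum(js_divergence(sentence_topics[i], sentence_topics[j]) for i, j in pairs)
        / len(pairs)
        if pairs
        else 0.0
    )

    return {
        "doc_entropy": entropy(doc),
        "mean_sentence_entropy": sum(entropy(s) for s in sentence_topics) / n,
        "mean_pairwise_jsd": mean_jsd,
    }


# Toy inputs (made up): a topically coherent and an incoherent document.
coherent = sentence_topic_features([[0.9, 0.1], [0.9, 0.1]])
incoherent = sentence_topic_features([[1.0, 0.0], [0.0, 1.0]])
```

In this toy setup the incoherent document, whose sentences jump between unrelated topics, gets a higher pairwise divergence than the coherent one. A vector of such features could then be passed to any standard classifier; which features and classifier the cited paper actually uses is not stated in the excerpts above.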