Web Spam Detection by Learning from Small Labeled Samples

Karimpour, Jaber; Noroozi, Ali A.; Alizadeh, Somayeh

doi:10.5120/7924-0993

Cited by 15 publications

(2 citation statements)

References 14 publications

“…To automatically determine the class labels (trust/distrust) of unlabelled response tweets, we adopted an expectation-maximization (EM) based semisupervised classifier (Nigam et al, 2006; Karimpour et al, 2012). EM is an iterative algorithm to maximize a posteriori estimation in datasets with both labeled and unlabeled data (Nigam et al, 2000).…”

Section: Methods (mentioning)

confidence: 99%

Trust and Engagement on Twitter During the Management of COVID-19 Pandemic: The Effect of Gender and Position

et al. 2022

During the COVID-19 pandemic, health and political leaders have attempted to update citizens using Twitter. Here, we examined the difference between environments that social media has provided for male/female or health/political leaders to interact with people during the COVID-19 pandemic. The comparison was made based on the content of posts and public responses to those posts as well as user-level and post-level metrics. Our findings suggest that although health officers and female leaders generated more content on Twitter, political leaders and male authorities were more active in building networks. Offensive language was used more frequently toward males than females and toward political leaders than health leaders. The public also used more appreciation keywords toward health leaders than politicians, while more judgmental and economy-related keywords were used toward politicians. Overall, depending on the gender and position of leaders, Twitter provided them with different environments to communicate and manage the pandemic.

“…Out of 627 product reviews, 10 reviews are found to be abusive, and hence, removed, and 48 are found to be spam PILAKA ANUSHA [9] Hadoop 1 represents the description of the classifiers, datasets and results used from the research work. Based on research work most of them used the SVM and naive bayes classifier, for different types of datasets.…”

Section: Hotel Dataset (mentioning)

confidence: 99%

Detection of Fake Online Reviews using Semi-supervised and Supervised learning

Yashaswini

2022

IJRASET

Nowadays, when people want to make a decision about a product or a service, they rely on reviews, which have become an essential part of decision making. When a customer wants to order a product on an e-commerce website, they first check the review section in detail and then decide about the product. If the posted reviews are satisfactory, the customer may order the product; reviews thus become a reputation parameter for businesses and companies and a great source of information for customers. Every customer assumes that the reviews he or she sees are authentic, but manipulation by individuals or rival companies can produce fake data, which are labeled as fake reviews. Such attempts, if unnoticed, cast doubt on the genuineness of the data. These reviews are therefore the most important parameter for businesses and companies. There exist groups or persons who use these reviews to deceive customers for their own interest or to damage their competitors' reputation. To solve this problem, we use machine learning techniques (supervised and semi-supervised) to detect whether a given review is fake or not with high accuracy. Along with this objective, we also focus on developing models that need less data to train. Since labeled data are not always available, we use semi-supervised machine learning to make use of unlabeled data. Our model should also be capable of giving results in reasonably little time. In this paper we propose several classification algorithms, such as Support Vector Machine (SVM), Random Forest (RF), and deep neural network
