2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)
DOI: 10.1109/asonam.2016.7752342

Cyberbullying detection using probabilistic socio-textual information fusion

Cited by 40 publications (16 citation statements)
References 13 publications
“…One of the main challenges in automated cyberbullying detection is the availability of cyberbullying-related data. From the datasets in TABLE 5, we can see that seven datasets contain 10% or less of cyberbullying-related (positive) samples [12], [20], [21], [78], [79], [83], [90], while only one dataset is almost balanced, with 42% positive samples and 58% negative samples [87]. Nine datasets have a percentage of positive samples between 11.7% and 29% [13], [15], [17], [53], [67], [78], [84]-[86], while the rest of the datasets contain between 30% and 39% of positive samples [8], [9], [14], [47].…”
Section: Data Annotation
Citation type: mentioning
confidence: 99%
“…The imbalance of the datasets resulted in many researchers processing the datasets in order to ensure that the trained machine learning models learn to differentiate between cyberbullying cases and non-cyberbullying-related cases. Some works over-sampled the positive samples by duplicating them multiple times in order to balance the dataset [20], [21], while other studies did the opposite by down-sampling negative samples in the dataset [53], [55], [83]. Some studies used search keywords on the streaming APIs to filter the incoming data and make sure to get more data with offensive content [12], [13], [67].…”
Section: Data Sampling
Citation type: mentioning
confidence: 99%
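
The two resampling strategies named in the Data Sampling statement above (duplicating positive samples, or down-sampling negative ones) could be sketched roughly as follows. This is only an illustrative sketch, not the procedure used by any of the cited studies: the function names, the `(text, label)` tuple layout, and the convention that label 1 marks a cyberbullying sample are all assumptions made for the example.

```python
import random


def oversample_positives(samples, seed=0):
    """Duplicate randomly chosen positive samples until both classes are equal in size.

    `samples` is a list of (text, label) pairs; label 1 = positive (assumed convention).
    """
    rng = random.Random(seed)
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    # Nothing to do if there are no positives or positives are not the minority.
    if not positives or len(positives) >= len(negatives):
        return list(samples)
    extra = [rng.choice(positives) for _ in range(len(negatives) - len(positives))]
    balanced = positives + extra + negatives
    rng.shuffle(balanced)
    return balanced


def downsample_negatives(samples, seed=0):
    """Keep a random subset of negative samples no larger than the positive class."""
    rng = random.Random(seed)
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    # Nothing to do if there are no positives or negatives are not the majority.
    if not positives or len(negatives) <= len(positives):
        return list(samples)
    kept = rng.sample(negatives, len(positives))
    balanced = positives + kept
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 2 positive and 8 negative samples (20% positive).
data = [("insult %d" % i, 1) for i in range(2)] + [("neutral %d" % i, 0) for i in range(8)]
over = oversample_positives(data)
under = downsample_negatives(data)
```

Over-sampling keeps every negative sample but repeats positive ones, whereas down-sampling discards negative samples; either way the resampled set ends up with equal class counts.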
“…Authors showed that the accuracy of cyberbullying detection could be significantly improved using a Linear Support Vector Machine (SVM) classifier and by integrating multi-modal features from the picture, text, and metadata in the media session. Singh et al. [31] proposed a framework to create better cyberbullying predictors by utilizing confidence scores and interdependencies linked to different textual and social features. Compared with another approach in the literature that used similar features and dataset, the proposed approach achieved major improvements in cyberbullying detection.…”
Section: Related Work
Citation type: mentioning
confidence: 99%