2019 IEEE International Conference on Big Data (Big Data) 2019
DOI: 10.1109/bigdata47090.2019.9006138

A novel oversampling method based on SeqGAN for imbalanced text classification

Cited by 7 publications (2 citation statements)
References 14 publications
“…From the original dataset, to prepare data for oversampling, the data is manually observed to seek for a patch of data that contains highest frequency of small quantity class distributions like "action", "substance", and "material" to gain more weightage for these classes. This study chooses 700 data in sequence as the patch data so that the training will be more accurate as in Balakrishnan and Lloyd-Yemoh [23] and Luo et al, [29]. The data is from data number 700 to 1400 which is partitioned into 563 training data and 140 test data.…”
Section: The Steps of HMM Construction (mentioning)
confidence: 99%
“…Balancing methods fall into three types: reducing the number of majority-class objects (undersampling), increasing the number of minority-class objects (oversampling), and hybrid methods. The first approach removes some of the majority-class data (see, for example, [18][19]); the second replicates existing minority-class instances or creates new ones [20][21][22]; and hybrid methods aim to combine the advantages of both approaches [23]. Modern neural network architectures for text analysis take both local and global context into account.…”
Section: Introduction (unclassified)