2020 IEEE International Conference on Big Data (Big Data)
DOI: 10.1109/bigdata50022.2020.9378048

Method for Customizable Automated Tagging: Addressing the problem of over-tagging and under-tagging text documents

Abstract: Using author-provided tags to predict tags for a new document often results in the over-generation of tags. In the case where the author does not provide any tags, documents face a severe under-tagging issue. In this paper, we present a method to generate a universal set of tags that can be applied widely to a large document corpus. Using IBM Watson's NLU service, we first collect keywords/phrases that we call "complex document tags" from 8,854 popular reports in the corpus. We apply an LDA model over th…
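The abstract is truncated, but the pipeline it describes (collect keyword/phrase tags per report, then fit a topic model over that collection) can be sketched roughly as follows. This is a minimal illustration using gensim's LDA implementation; the variable names, example tags, and topic count are assumptions, not details taken from the paper.

```python
# Sketch of the LDA step described in the abstract, assuming the "complex
# document tags" have already been extracted (e.g. via IBM Watson NLU) as one
# list of keyword phrases per report. All names and values here are illustrative.
from gensim import corpora, models

# Hypothetical input: one list of extracted tag phrases per report.
complex_tags = [
    ["revenue forecast", "quarterly sales", "regional growth"],
    ["churn rate", "customer retention", "subscription revenue"],
    # ... one entry per report (the paper draws on 8,854 reports)
]

dictionary = corpora.Dictionary(complex_tags)
corpus = [dictionary.doc2bow(tags) for tags in complex_tags]

# Fit an LDA topic model over the tag collection; the top terms per topic can
# then serve as candidates for a smaller, universal tag set.
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=20, passes=5)
for topic_id, terms in lda.show_topics(num_topics=5, num_words=5, formatted=False):
    print(topic_id, [term for term, _ in terms])
```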

Cited by 2 publications (3 citation statements)
References 5 publications
“…One of the most important training challenges for deep neural networks, particularly recurrent neural networks (RNNs), is the vanishing gradient issue. Gradients that are transmitted backwards from the output layer to the input layer during training become very tiny, which is when this problem occurs (Pandya et al., 2020). These tiny gradients provide insignificant updates to the weights and biases of the preceding layers, which slow down training and may make it more difficult for the model to understand complex connections in the data.…”
Section: Vanishing Gradient Problem
confidence: 99%
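The mechanism described in this citation statement can be demonstrated with a small numerical sketch. This example is not from the cited paper: it simply shows that when each backpropagation step multiplies the gradient by a factor smaller than one (sigmoid derivative times a recurrent weight), the gradient magnitude decays exponentially over timesteps.

```python
# Illustrative only: why backpropagated gradients shrink over many RNN timesteps.
# With a sigmoid activation, each step multiplies the gradient by sigma'(z) * w;
# when that factor is below 1, the product decays exponentially, so early
# timesteps receive almost no learning signal.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = 0.9      # recurrent weight (scalar, assumed for simplicity)
z = 0.5      # pre-activation value reused at every step
grad = 1.0   # gradient arriving at the last timestep

for t in range(1, 51):
    grad *= sigmoid(z) * (1 - sigmoid(z)) * w   # chain-rule factor per step
    if t % 10 == 0:
        print(f"after {t} steps: gradient magnitude ~ {grad:.2e}")
```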
“…Prophet forecasting models are commonly used for making predictions. Simply put, this type of model says that future values depend on past values and some random factor (Pandya, Reyes, & Vanderheyden, 2020). Facebook made a popular model called Prophet to help with this.…”
confidence: 99%
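For readers unfamiliar with Prophet, the usage pattern alluded to in this statement looks roughly like the sketch below. It assumes the `prophet` package and a pandas DataFrame with the column names Prophet expects (`ds` for dates, `y` for observed values); the data here is synthetic placeholder material, not from either paper.

```python
# Minimal Prophet usage sketch with synthetic data.
import pandas as pd
from prophet import Prophet

df = pd.DataFrame({
    "ds": pd.date_range("2020-01-01", periods=100, freq="D"),
    "y": range(100),  # placeholder series; use real observations in practice
})

model = Prophet()                                  # trend + seasonality + noise
model.fit(df)
future = model.make_future_dataframe(periods=30)   # extend 30 days ahead
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```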
“…This records various pieces of information about each article, such as the number of replies or reposts, used as a reference for subsequent event graph drawing and the judgment of abnormal situations [27].…”
Section: Text Tagging Analysis
confidence: 99%