Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.737
F²-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax

Abstract: Despite recent advances in neural text generation, encoding the rich diversity in human language remains elusive. We argue that the sub-optimal text generation is mainly attributable to the imbalanced token distribution, which particularly misdirects the learning model when trained with the maximum-likelihood objective. As a simple yet effective remedy, we propose two novel methods, F²-Softmax and MefMax, for a balanced training even with the skewed frequency distribution. MefMax assigns tokens uniquely to fr…
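The factorization the abstract describes (predict a frequency class first, then a token within that class) can be sketched as follows. This is a minimal PyTorch illustration under assumed shapes and names (`FactorizedSoftmax` and `class_of_token` are hypothetical); the paper's exact parameterization may differ, e.g. it may use per-class output layers rather than a masked full-vocabulary head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedSoftmax(nn.Module):
    """Two-stage softmax: predict a frequency class, then a token within
    that class. Class ids (`class_of_token`) are assumed to come from a
    MefMax-style assignment and are simply passed in here."""

    def __init__(self, hidden_dim, vocab_size, class_of_token):
        super().__init__()
        # class_of_token: LongTensor [vocab_size], class id of each token
        self.register_buffer("class_of_token", class_of_token)
        self.num_classes = int(class_of_token.max().item()) + 1
        self.class_head = nn.Linear(hidden_dim, self.num_classes)
        self.token_head = nn.Linear(hidden_dim, vocab_size)

    def log_prob(self, hidden, target):
        # hidden: [batch, hidden_dim], target: [batch] gold token ids
        target_class = self.class_of_token[target]                     # [batch]
        log_p_class = F.log_softmax(self.class_head(hidden), dim=-1)   # [batch, C]

        # Restrict the token-level softmax to tokens in the target's class.
        token_logits = self.token_head(hidden)                         # [batch, V]
        same_class = self.class_of_token.unsqueeze(0) == target_class.unsqueeze(1)
        token_logits = token_logits.masked_fill(~same_class, float("-inf"))
        log_p_token = F.log_softmax(token_logits, dim=-1)

        # log p(x) = log p(class(x)) + log p(x | class(x))
        return (log_p_class.gather(1, target_class.unsqueeze(1))
                + log_p_token.gather(1, target.unsqueeze(1))).squeeze(1)
```

Training then simply minimizes the negative of this log-probability; because each class covers a comparable share of the corpus frequency mass, the within-class softmax sees a far less skewed target distribution than a flat softmax would.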

Citations: cited by 11 publications (9 citation statements)
References: 31 publications
“…As a standard approach to training a neural text generation model, MLE has been shown to be defective. Choi et al. (2020) demonstrate that MLE may mislead the model because of the imbalanced token distribution. Thus, they design a greedy approach, MefMax, and factorize the softmax to ensure balanced training according to word frequency.…”
Section: Training-based Methods
Citation type: mentioning; confidence: 99%
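The "greedy approach" mentioned in this statement assigns tokens to frequency classes so that the classes carry roughly comparable frequency mass. The sketch below is only an assumption-laden stand-in: the actual MefMax criterion in Choi et al. (2020) maximizes a mean-efficiency (normalized-entropy) objective, whereas this version greedily fills equal-mass buckets; `greedy_frequency_classes` is a hypothetical helper name.

```python
from collections import Counter

def greedy_frequency_classes(token_counts, num_classes):
    """Greedily partition tokens (sorted by frequency) into contiguous
    classes whose total frequency mass is roughly equal. Simplified
    illustration of frequency-balanced bucketing, not the exact MefMax
    objective."""
    sorted_tokens = sorted(token_counts, key=token_counts.get, reverse=True)
    total = sum(token_counts.values())
    target_mass = total / num_classes

    class_of_token, cls, mass = {}, 0, 0
    for tok in sorted_tokens:
        class_of_token[tok] = cls
        mass += token_counts[tok]
        if mass >= target_mass and cls < num_classes - 1:
            cls, mass = cls + 1, 0
    return class_of_token

# Toy usage: frequent tokens land in low class ids, rare tokens in high ones.
counts = Counter("the cat sat on the mat and the dog sat on the log".split())
print(greedy_frequency_classes(counts, num_classes=3))
```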
“…We performed experiments on the WikiText-103 dataset (Merity et al., 2017), a large-scale benchmark containing more than 29 thousand Wikipedia articles and over 100 million words in total. WikiText-103 has been widely used in language modeling work (Welleck et al., 2020; Martins et al., 2020; Choi et al., 2020), but in order to train our POS-guided Softmax we need the corresponding POS tags. We use Stanford CoreNLP's POS tagger (Manning et al., 2014) to annotate words in WikiText-103 with XPOS tags (Hornby et al., 2017).…”
Section: Dataset
Citation type: mentioning; confidence: 99%
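A minimal sketch of the annotation step this statement describes. The cited work uses Stanford CoreNLP's tagger; the stanza library is used here purely as an assumed stand-in, since it exposes XPOS (Penn Treebank-style) tags through a simple Python API.

```python
# Annotate raw text with XPOS tags, roughly mirroring the preprocessing
# described above (CoreNLP itself would be driven through its own server/API).
import stanza

stanza.download("en")  # one-time model download
nlp = stanza.Pipeline("en", processors="tokenize,pos")

doc = nlp("Neural text generation remains an open research problem.")
for sentence in doc.sentences:
    for word in sentence.words:
        print(word.text, word.xpos)
```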
“…Thus, the models often struggle to generate diverse outputs. This has been addressed with different techniques, such as unlikelihood training (Welleck et al., 2020) and F²-Softmax (Choi et al., 2020). Clarification utility maximization (next subsection) also implicitly addresses this issue.…”
Section: Sequence-to-sequence Models
Citation type: mentioning; confidence: 99%
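For context on the first technique named in this statement: unlikelihood training discourages the model from placing probability mass on negative candidate tokens, typically tokens that already appeared earlier in the context. The PyTorch sketch below is a hedged token-level illustration; `unlikelihood_loss`, its candidate construction, and its normalization are illustrative choices rather than the exact formulation of Welleck et al. (2020).

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, targets, alpha=1.0, pad_id=0):
    """Token-level unlikelihood training, sketched.

    logits:  [batch, seq_len, vocab] model outputs
    targets: [batch, seq_len] gold token ids
    Negative candidates at step t are the gold tokens seen at steps < t.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # Standard MLE term.
    nll = F.nll_loss(log_probs.transpose(1, 2), targets, ignore_index=pad_id)

    batch, seq_len, vocab = logits.shape
    probs = log_probs.exp()

    # candidates[b, t, v] = True iff token v occurred in targets[b, :t].
    prev = targets.unsqueeze(1).expand(batch, seq_len, seq_len)      # [B, T, T]
    lower = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                  device=targets.device), diagonal=-1)
    candidates = torch.zeros(batch, seq_len, vocab,
                             dtype=torch.bool, device=logits.device)
    candidates.scatter_(2, prev.masked_fill(~lower, pad_id), True)
    candidates[:, :, pad_id] = False
    candidates.scatter_(2, targets.unsqueeze(-1), False)  # keep the gold token

    # Penalize log(1 - p) for every candidate token.
    ul = -(torch.log1p(-probs.clamp(max=1 - 1e-6)) * candidates).sum()
    ul = ul / candidates.sum().clamp(min=1)
    return nll + alpha * ul
```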