Myeongho Jeong scite author profile

Despite the super-human accuracy of recent deep models in NLP tasks, their robustness is reportedly limited due to their reliance on spurious patterns. We thus aim to leverage contrastive learning and counterfactual augmentation for robustness. For augmentation, existing work either requires humans to add counterfactuals to the dataset or machines to automatically match near-counterfactuals already in the dataset. Unlike existing augmentation, which is affected by spurious correlations, ours, by synthesizing “a set” of counterfactuals and making a collective decision on the distribution of predictions on this set, can robustly supervise the causality of each term. Our empirical results show that our approach, by collective decisions, is less sensitive to task model bias of attribution-based synthesis, and thus achieves significant improvements in diverse dimensions: 1) counterfactual robustness, 2) cross-domain generalization, and 3) generalization from scarce data.

Structure-Augmented Keyphrase Generation

Kim, Jeong, Choi, et al. 2021

This paper studies the keyphrase generation (KG) task for scenarios where structure plays an important role. For example, a scientific publication consists of a short title and a long body, where the title can be used for de-emphasizing unimportant details in the body. Similarly, for short social media posts (e.g., tweets), scarce context can be augmented from titles, though these are often missing. Our contribution is generating/augmenting structure and then encoding this information, using existing keyphrases of other documents to complement missing/incomplete titles. Specifically, we first extend the given document with related but absent keyphrases from existing keyphrases to augment missing contexts (generating structure), and then build a graph of keyphrases and the given document to obtain a structure-aware representation of the augmented text (encoding structure). Our empirical results validate that our proposed structure augmentation and structure-aware encoding can improve KG for both scenarios, outperforming the state of the art.

Conditional Response Augmentation for Dialogue Using Knowledge Distillation

Jeong, Choi, Han, et al. 2020

Evaluating the Knowledge Dependency of Questions

Hyeongdon, Yang, Yu, et al. 2022

The automatic generation of Multiple Choice Questions (MCQ) has the potential to significantly reduce the time educators spend on student assessment. However, existing evaluation metrics for MCQ generation, such as BLEU, ROUGE, and METEOR, focus on the n-gram based similarity of the generated MCQ to the gold sample in the dataset and disregard its educational value. They fail to evaluate the MCQ's ability to assess the student's knowledge of the corresponding target fact. To tackle this issue, we propose a novel automatic evaluation metric, coined Knowledge Dependent Answerability (KDA), which measures the MCQ's answerability given knowledge of the target fact. Specifically, we first show how to measure KDA based on student responses from a human survey. Then, we propose two automatic evaluation metrics, KDA_disc and KDA_cont, that approximate KDA by leveraging pre-trained language models to imitate students' problem-solving behavior. Through our human studies, we show that KDA_disc and KDA_cont have strong correlations with both (1) KDA and (2) usability in an actual classroom setting, labeled by experts. Furthermore, when combined with n-gram based similarity metrics, KDA_disc and KDA_cont are shown to have strong predictive power for various expert-labeled MCQ quality measures.

Label and Context Augmentation for Response Selection at DSTC8

Jeong, Choi, Yeo, et al. 2021

IEEE/ACM Trans. Audio Speech Lang. Process.

Evaluation of Question Generation Needs More References

Oh, Go, Hyeongdon, et al. 2023

Cross Encoding as Augmentation: Towards Effective Educational Text Classification

Lee, Choi, Lee, et al. 2023

Towards Zero-Shot Functional Compositionality of Language Models

Yu, Jeong, Shin, et al. 2023

Preprint

Large Pre-trained Language Models (PLMs) have become the most desirable starting point in the field of NLP, as they have become remarkably good at solving many individual tasks. Despite such success, in this paper, we argue that current paradigms of working with PLMs are neglecting a critical aspect of modeling human intelligence: functional compositionality. Functional compositionality, the ability to compose learned tasks, has been a long-standing challenge in the field of AI (and many other fields), as it is considered one of the hallmarks of human intelligence. An illustrative example is cross-lingual summarization, where a bilingual person (English-French) could directly summarize an English document into French sentences without having to explicitly translate the English document or summary into French. We discuss why this is an important open problem that requires further attention from the field. Then, we show that current PLMs (e.g., GPT-2 and T5) do not yet have functional compositionality and are far from human-level generalizability. Finally, we suggest several research directions that could push the field towards zero-shot functional compositionality of language models.

C2L: Causally Contrastive Learning for Robust Text Classification
