Substructure Substitution: Structured Data Augmentation for NLP

Shi, Haoyue; Livescu, Karen; Gimpel, Kevin

doi:10.48550/arxiv.2101.00411

Cited by 4 publications

(5 citation statements)

References 57 publications

“…This is known as structure augmentation (also known as substructure augmentation); in our case, this form of data augmentation divides the research data into two data tree structures (i.e., aerospace and aviation). This structural data augmentation allows us to perform comparative NLP tasks such as text parsing, textual classification, and comparative token analysis (Shi et al, 2021).…”

Section: Methods (mentioning)

confidence: 99%

Distinguishing the Job Market Across Aerospace and Aviation: A Natural Language Processing Approach

Walden, Pritchard. 2023. CARI

This study dives into the intricate landscape of the aerospace and aviation job market. While these two markets are often conflated as being similar, if not the same, we propose that the differences are important to recent graduates of educational institutions and career programs. The research utilized a custom-written Natural Language Processing (NLP) software tool to distinguish the differences in 6,000 job offerings between the two industries with the hope of illuminating nuances to those in positions involved in placing professionals into careers. This research not only reveals the dynamic employment landscape of aerospace and aviation but also highlights the power of NLP in more clearly discerning emerging trends in job data.

Section: Methods (mentioning)

confidence: 99%

Untitled

2023. CARI

The University Aviation Association publishes the Collegiate Aviation Review International throughout each calendar year. Papers published in each volume and issue are selected from submissions that were subjected to a double-blind peer review process. The University Aviation Association is the only professional organization representing all levels of the non-engineering/technology element in collegiate aviation education and research. Working through its officers, trustees, committees, and professional staff, the University Aviation Association plays a vital role in collegiate aviation and in the aerospace industry. The University Aviation Association accomplishes its goals through a number of objectives:

“…proposed a multi-task view of DA. SUB² (Shi et al, 2021) generates new examples by substituting substructures via constituency parse trees. Although these methods are easy to implement, they do not consider controlling data quality and diversity.…”

Section: Rule-based Methods (mentioning)

confidence: 99%

EPiDA: An Easy Plug-in Data Augmentation Framework for High Performance Text Classification

Zhao, Zhang, Xu et al. 2022. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua

Recent works have empirically shown the effectiveness of data augmentation (DA) for NLP tasks, especially for those suffering from data scarcity. Intuitively, given the size of generated data, their diversity and quality are crucial to the performance of targeted tasks. However, to the best of our knowledge, most existing methods consider only either the diversity or the quality of augmented data, thus cannot fully tap the potential of DA for NLP. In this paper, we present an easy and plug-in data augmentation framework EPiDA to support effective text classification. EPiDA employs two mechanisms: relative entropy maximization (REM) and conditional entropy minimization (CEM) to control data generation, where REM is designed to enhance the diversity of augmented data while CEM is exploited to ensure their semantic consistency. EPiDA can support efficient and continuous data generation for effective classifier training. Extensive experiments show that EPiDA outperforms existing SOTA methods in most cases, though not using any agent network or pre-trained generation network, and it works well with various DA algorithms and classification models. Code is available at https://github.com/zhaominyiz/EPiDA.
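
The citation statement above describes the cited method as generating new examples by substituting substructures via constituency parse trees. A minimal sketch of that idea follows, assuming toy trees written as nested `(label, children)` tuples and using "same constituent label" as the only substitution condition. The helper names (`leaves`, `subtrees_with_label`, `substitute`, `augment`) and the example sentences are hypothetical and are not taken from Shi et al. (2021); the conditions used in the paper may differ.

```python
import random
from typing import Iterator, List, Optional, Union

# Toy constituency tree (an assumption of this sketch): a node is
# (label, [children]); a child is either another node or a word (str).
Node = Union[str, tuple]


def leaves(tree: Node) -> List[str]:
    """Return the words of the tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    words: List[str] = []
    for child in tree[1]:
        words.extend(leaves(child))
    return words


def subtrees_with_label(tree: Node, label: str) -> Iterator[tuple]:
    """Yield every subtree whose root label equals `label`."""
    if isinstance(tree, str):
        return
    if tree[0] == label:
        yield tree
    for child in tree[1]:
        yield from subtrees_with_label(child, label)


def substitute(tree: Node, target: tuple, replacement: tuple) -> Node:
    """Return a copy of `tree` with the node `target` (matched by identity) replaced."""
    if tree is target:
        return replacement
    if isinstance(tree, str):
        return tree
    return (tree[0], [substitute(child, target, replacement) for child in tree[1]])


def augment(tree_a: Node, tree_b: Node, label: str,
            rng: random.Random) -> Optional[Node]:
    """Swap one `label` constituent of tree_a for one taken from tree_b."""
    targets = list(subtrees_with_label(tree_a, label))
    donors = list(subtrees_with_label(tree_b, label))
    if not targets or not donors:
        return None  # no constituent with this label in one of the trees
    return substitute(tree_a, rng.choice(targets), rng.choice(donors))


# Hypothetical example sentences with hand-written parses.
a = ("S", [("NP", ["the", "cat"]),
           ("VP", ["sat", ("PP", ["on", ("NP", ["the", "mat"])])])])
b = ("S", [("NP", ["a", "dog"]), ("VP", ["barked"])])

new_tree = augment(a, b, "VP", random.Random(0))
print(" ".join(leaves(new_tree)))  # the cat barked
```

Each tree has exactly one `VP`, so the `VP` swap is deterministic. With `label="NP"` the result depends on which of the two noun phrases in `a` the random generator picks. In practice the trees would come from a constituency parser rather than being written by hand.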
