Rabin Banjade scite author profile

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards, and robustness analysis results are publicly available in the NL-Augmenter repository (https://github.com/GEM-benchmark/NL-Augmenter).
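The distinction the abstract draws between a transformation (which modifies an example) and a filter (which selects examples by a feature) can be illustrated with a minimal Python sketch. It does not use the NL-Augmenter API; the function names `swap_adjacent_chars`, `length_filter`, and `augment` are hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, List


def swap_adjacent_chars(sentence: str, prob: float = 0.1, seed: int = 0) -> str:
    """Transformation (hypothetical): typo-like noise that swaps two adjacent
    inner characters in some words, leaving first and last characters intact."""
    rng = random.Random(seed)
    words = []
    for word in sentence.split():
        chars = list(word)
        # Only perturb words with at least two inner characters.
        if len(chars) > 3 and rng.random() < prob:
            i = rng.randrange(1, len(chars) - 2)
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        words.append("".join(chars))
    return " ".join(words)


def length_filter(sentence: str, min_tokens: int = 5) -> bool:
    """Filter (hypothetical): keep only sentences with at least min_tokens tokens."""
    return len(sentence.split()) >= min_tokens


def augment(
    data: List[str],
    transform: Callable[[str], str],
    keep: Callable[[str], bool],
) -> List[str]:
    """Apply the transformation to the subset of the data selected by the filter."""
    return [transform(s) for s in data if keep(s)]


if __name__ == "__main__":
    corpus = [
        "Data augmentation helps evaluate model robustness.",
        "Too short.",
    ]
    for line in augment(corpus, lambda s: swap_adjacent_chars(s, prob=0.5), length_filter):
        print(line)
```

Comparing a model's score on the original subset with its score on the perturbed subset gives a simple robustness measurement of the kind the abstract describes.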

A Case of Diffuse Alveolar Hemorrhage With COVID-19 Vaccination

Sharma, Upadhyay, Banjade et al. 2022

With the growing rates of vaccination against coronavirus disease 2019 (COVID-19) across the globe, rare side effects have been increasingly noticed on a post-marketing basis. Cases of myocarditis and pericarditis have been reported in the literature following COVID-19 messenger RNA (mRNA) vaccination. However, diffuse alveolar hemorrhage (DAH) following vaccination has not been reported. DAH is a life-threatening clinicopathological entity characterized by bleeding into the alveolar space from the pulmonary microvasculature. It presents a diagnostic challenge in the setting of acute respiratory failure, requiring prompt suspicion and workup. We report a case of a 59-year-old male with a recent COVID-19 infection who presented with DAH within eight hours of the first dose of mRNA vaccination (Moderna, Cambridge, MA). Bronchoalveolar lavage was performed, along with imaging of the chest, to confirm the diagnosis. Immunological workup, including rheumatoid factor, anti-citrullinated peptide, anti-neutrophil cytoplasmic antibodies (P-ANCA and C-ANCA), anti-glomerular basement membrane antibodies, anti-double-stranded DNA, C3 and C4 complement levels, and cryoglobulins, was negative. Infectious workup with cultures and PCR from bronchial lavage was also negative. In the absence of any other cause, the etiology was deemed likely to be vaccine-induced DAH. Herein, we also discuss the possible mechanism of vaccine-related DAH and emphasize the need for further studies on vaccine-related adverse events.

NL-Augmenter 🦎 → 🐍 A Framework for Task-Sensitive Natural Language Augmentation

Dhole, Gangal, Gehrmann et al. 2023

NEJLT

Data augmentation is an important method for evaluating the robustness of and enhancing the diversity of training data for natural language processing (NLP) models. In this paper, we present NL-Augmenter, a new participatory Python-based natural language (NL) augmentation framework which supports the creation of transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of NL tasks annotated with noisy descriptive tags. The transformations incorporate noise, intentional and accidental human mistakes, socio-linguistic variation, semantically-valid style, syntax changes, as well as artificial constructs that are unambiguous to humans. We demonstrate the efficacy of NL-Augmenter by using its transformations to analyze the robustness of popular language models. We find different models to be differently challenged on different tasks, with quasi-systematic score decreases. The infrastructure, datacards, and robustness evaluation results are publicly available on GitHub for the benefit of researchers working on paraphrase generation, robustness analysis, and low-resource NLP. El aumento de datos es un método importante para evaluar la solidez y mejorar la diversidad del entrenamiento datos para modelos de procesamiento de lenguaje natural (NLP). इस लेख में, हम एनएल-ऑगमेंटर का प्रस्ताव करते हैं - एक नया भागी- दारी पूर्वक, पायथन में बनाया गया, लैंग्वेज (एनएल) ऑग्मेंटेशन फ्रेमवर्क जो ट्रांसफॉर्मेशन (डेटा में बदलाव करना) और फीलटर (फीचर्स के अनुसार डेटा का भाग करना) के नीरमान का समर्थन करता है।. 我们描述了NL-Augmenter框架及其初步包含的117种转换和23个过滤器，并大致标注分类了一系列可适配的自然语言任务. این دگرگونی ها شامل نویز، اشتباهات عمدی و تصادفی انسانی، تنوع اجتماعی-زبانی، سبک معنایی معتبر، تغییرات نحوی و همچنین ساختارهای مصنوعی است که برای انسان ها مبهم است. 
NL-Augmenterpa allin kaynintam qawachiyku, tikrakuyninku- nata servichikuspayku, chaywanmi qawariyku modelos de lenguaje popular nisqapa allin takyasqa kayninta. Kami menemukan model yang berbeda ditantang secara berbeda pada tugas yang berbeda, dengan penurunan skor kuasi-sistematis. Infrastruktur, kartu data, dan hasil evaluasi ketahanan dipublikasikan tersedia secara gratis di GitHub untuk kepentingan para peneliti yang mengerjakan pembuatan parafrase, analisis ketahanan, dan NLP sumber daya rendah.

Domain Model Discovery from Textbooks for Computer Programming Intelligent Tutors

Banjade, Oli, Tamang et al. 2021

FLAIRS

We present a novel approach to intro-to-programming domain model discovery from textbooks using an over-generation and ranking strategy. We first extract candidate key phrases from each chapter of a Computer Science textbook focusing on intro-to-programming and then rank those concepts according to a number of metrics, such as the standard tf-idf weight used in information retrieval and metrics produced by other text ranking algorithms. Specifically, we conduct our work in the context of developing an intelligent tutoring system for source code comprehension, for which a specification of the key programming concepts is needed: the system monitors students' performance on those concepts and scaffolds their learning process until they show mastery of the concepts. Our experiments with programming concept instruction from Java textbooks indicate that statistical methods such as KP-Miner are quite competitive with other, more sophisticated methods. Automated discovery of domain models will lead to more scalable Intelligent Tutoring Systems (ITSs) across topics and domains, which is a major challenge that needs to be addressed if ITSs are to be widely used by millions of learners across many domains.
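The over-generation and ranking strategy described above can be sketched in a few lines of Python. This is an illustrative sketch only, assuming plain n-gram candidates and a textbook tf-idf score; it is not the authors' pipeline and does not implement KP-Miner. The names `STOPWORDS`, `candidates`, and `rank_by_tfidf`, the tiny stopword list, and the sample chapters are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

# Illustrative stopword list (an assumption, far smaller than a real one).
STOPWORDS = {"a", "an", "the", "of", "in", "to", "and", "is", "are", "that", "each", "with", "on"}


def candidates(text: str, max_len: int = 2) -> List[str]:
    """Over-generate: every 1..max_len-gram that neither starts nor ends with a stopword."""
    tokens = re.findall(r"[a-z]+", text.lower())
    grams = []
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            grams.append(" ".join(gram))
    return grams


def rank_by_tfidf(chapters: Dict[str, str], top_k: int = 3) -> Dict[str, List[Tuple[str, float]]]:
    """Rank each chapter's candidate phrases by tf-idf, treating chapters as documents."""
    counts = {name: Counter(candidates(text)) for name, text in chapters.items()}
    n_docs = len(chapters)
    doc_freq: Counter = Counter()
    for phrase_counts in counts.values():
        doc_freq.update(phrase_counts.keys())
    ranked = {}
    for name, phrase_counts in counts.items():
        total = sum(phrase_counts.values()) or 1
        scores = {
            phrase: (tf / total) * math.log(n_docs / doc_freq[phrase])
            for phrase, tf in phrase_counts.items()
        }
        # Sort by descending score, breaking ties alphabetically for determinism.
        ranked[name] = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return ranked


if __name__ == "__main__":
    sample = {
        "loops": "the loop repeats the loop body",
        "classes": "the class defines the object body",
    }
    for chapter, phrases in rank_by_tfidf(sample).items():
        print(chapter, phrases)
```

Phrases that occur in every chapter receive an idf of zero and fall to the bottom of the ranking, which is why chapter-specific concepts surface first.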

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
