2020
DOI: 10.1007/978-3-030-58536-5_31

GDumb: A Simple Approach that Questions Our Progress in Continual Learning

Abstract: Standardized benchmarks drive progress in machine learning. However, with repeated testing, the risk of overfitting grows as algorithms overexploit benchmark idiosyncrasies. In our work, we seek to mitigate this challenge by compiling ever-expanding large-scale benchmarks called Lifelong Benchmarks. As exemplars of our approach, we create Lifelong-CIFAR10 and Lifelong-ImageNet, containing (for now) 1.69M and 1.98M test samples, respectively. While reducing overfitting, lifelong benchmarks introduce a key chall…

Cited by 313 publications (317 citation statements). References 71 publications (67 reference statements).
“…The latter is considered a strong CL baseline (Maltoni and Lomonaco, 2019) and thus we use this approach in this study. We also compare against a variation of the replay-based GDumb baseline (Prabhu et al., 2020). GDumb collects examples into a memory buffer with a limited budget size k, balancing the distribution over labels by greedily sampling underrepresented label types and ejecting over-sampled label types.…”
Section: Continuous Learning
confidence: 99%
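The greedy balancing rule quoted above is simple enough to sketch directly. Below is a minimal, illustrative Python sketch of such a class-balanced buffer; the names (`GreedyBalancedBuffer`, `add`) are assumptions for illustration, not the authors' reference code:

```python
# Minimal sketch of a greedy class-balanced replay buffer, as described
# in the citation above: keep at most k samples total, admit a sample only
# if its label is below an equal per-label share, and eject from the
# currently largest class when the budget is exhausted.
import random

class GreedyBalancedBuffer:
    def __init__(self, k):
        self.k = k                    # total memory budget
        self.per_class = {}           # label -> list of stored samples

    def size(self):
        return sum(len(xs) for xs in self.per_class.values())

    def add(self, x, y):
        samples = self.per_class.setdefault(y, [])
        cap = self.k // len(self.per_class)  # equal share per observed label
        if len(samples) >= cap:
            return                    # label y already has its share; skip
        if self.size() >= self.k:
            # make room: eject a random sample from the largest class
            largest = max(self.per_class.values(), key=len)
            largest.pop(random.randrange(len(largest)))
        samples.append(x)
```

On a stream, one simply calls `buffer.add(x, y)` for every incoming sample; the rule needs no task boundaries, which is part of what makes the baseline so simple.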
“…The approach in [28] tries to retrieve only the samples that are most conflicted. At the same time, [29] proposes to improve performance by greedily storing samples in memory and retraining on these stored samples while testing. The work in [30] proposes an expansion-based approach for task-free continual learning built upon a Bayesian nonparametric framework.…”
Section: A. Incremental Learning
confidence: 99%
“…Comparison is also performed with [27], which proposes an intermediate expert to adapt the target model to the new task, and with [28], which retrieves the samples that are frequently conflicted. We also include a comparison with [29], which greedily stores samples in memory and, during testing, trains a model from scratch on these stored samples. Finally, we compare with [30], which increases the number of neural network experts under a Bayesian non-parametric framework. We carried out experiments comparing the results obtained from the proposed approach with different algorithms: 1.…”
Section: Comparison With Existing Literature
confidence: 99%
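The other half of the recipe summarized in this statement, training a model from scratch on only the buffered samples at evaluation time, can be illustrated with a deliberately toy stand-in. This is an assumption-laden sketch: scikit-learn's LogisticRegression substitutes for the deep networks used in the actual experiments, and `buffer` is the sketch class above:

```python
# Toy illustration of the "train from scratch at test time" step: fit a
# fresh model on the buffered samples only, then score it on the test set.
# LogisticRegression is a stand-in, not the setup used in the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

def gdumb_eval(buffer, X_test, y_test):
    X = np.array([x for xs in buffer.per_class.values() for x in xs])
    y = np.array([label for label, xs in buffer.per_class.items() for _ in xs])
    model = LogisticRegression(max_iter=1000).fit(X, y)  # fresh model each eval
    return model.score(X_test, y_test)
```

Because the learner never sees anything outside the balanced buffer, its accuracy is a useful lower bound against which more elaborate continual-learning methods can be judged.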
“…Our work follows this transfer learning paradigm, but our main focus is to investigate the regression phenomenon when updating backbone pre-trained models. Another related stream of research is lifelong learning (Lopez-Paz and Ranzato, 2017; Yoon et al., 2018; Delange et al., 2021; Sun et al., 2019; Chuang et al., 2020), incremental learning (Rebuffi et al., 2017; Chaudhry et al., 2018; Prabhu et al., 2020), or concept drifting (Schlimmer and Granger, 1986; Tsymbal, 2004; Klinkenberg, 2005; Žliobaitė, 2016), which aims to accumulate knowledge learned either in previous tasks or from data with a changing distribution. The model update regression problem differs in that models are trained on the same task and dataset, but we update from one model to another.…”
Section: Transfer Learning, Lifelong Learning, and Concept Drifting
confidence: 99%