Multi-Task Learning with Deep Neural Networks: A Survey

Crawshaw, Michael

doi:10.48550/arxiv.2009.09796

Cited by 118 publications

(144 citation statements)

References 116 publications

Supporting

Mentioning

118

Contrasting

Order By: Relevance

“…expert training stage, is to learn representation for each specialized task type such as image classification, object detection, and semantic segmentation. This design naturally mitigates the learning difficulty caused by task conflicts [12] in multi-task learning setups. The representation learned for each task type performs better than simple ImageNet pretraining when tested on other tasks of the same type [29] strengthened knowledge specifically for that task type.…”

Section: Easy Extensibility and Great Generalizabilitymentioning

confidence: 99%

INTERN: A New Learning Paradigm Towards General Vision

Shao¹,

Chen²,

Li³

et al. 2021

Preprint

View full text Add to dashboard Cite

Enormous waves of technological innovations over the past several years, marked by the advances in AI technologies, are profoundly reshaping the industry and the society. However, down the road, a key challenge awaits us, that is, our capability of meeting rapidly-growing scenario-specific demands is severely limited by the cost of acquiring a commensurate amount of training data. This difficult situation is in essence due to limitations of the mainstream learning paradigm: we need to train a new model for each new scenario, based on a large quantity of well-annotated data and commonly from scratch. In tackling this fundamental problem, we move beyond and develop a new learning paradigm named INTERN. By learning with supervisory signals from multiple sources in multiple stages, the model being trained will develop strong generalizability. We evaluate our model on 26 well-known datasets that cover four categories of tasks in computer vision. In most cases, our models, adapted with only 10% of the training data in the target domain, outperform the counterparts trained with the full set of data, often by a significant margin. This is an important step towards a promising prospect where such a model with general vision capability can dramatically reduce our reliance on data, thus expediting the adoption of AI technologies. Furthermore, revolving around our new paradigm, we also introduce a new data system, a new architecture, and a new benchmark, which, together, form a general vision ecosystem to support its future development in an open and inclusive manner.

show abstract

Section: Easy Extensibility and Great Generalizabilitymentioning

confidence: 99%

INTERN: A New Learning Paradigm Towards General Vision

Shao¹,

Chen²,

Li³

et al. 2021

Preprint

View full text Add to dashboard Cite

show abstract

“…Therefore, in order to create an effective MTL architecture it is important to analyze how to combine the shared modules (layers) and task specific modules and what portion of model's parameters will be shared between tasks. In conventional MTL, the parameter sharing approach is classified as [18]-…”

Section: Multi-task Learning (Mtl)mentioning

confidence: 99%

Sharing to learn and learning to share -- Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning: A meta review

Upadhyay¹,

Phlypo²,

Saini³

et al. 2021

Preprint

View full text Add to dashboard Cite

Integrating knowledge across different domains is an essential feature of human learning. Learning paradigms like transfer learning, meta learning, and multi-task learning reflect the human learning process by exploiting the prior knowledge for new tasks, encouraging faster learning and good generalization for new tasks. This article gives a detailed view of these learning paradigms along with a comparative analysis. The weakness of a learning algorithm turns out to be the strength of another, and thereby merging them is a prevalent trait in the literature. This work delivers a literature review of the articles, which fuses two algorithms to accomplish multiple tasks. A global generic learning network, an ensemble of meta learning, transfer learning, and multi-task learning, is also introduced here, along with some open research questions and directions for future research.

show abstract

“…Although unified QG encoding enables models to process question generation across formats, how to effectively and efficiently train a QG model across multiple datasets is still challenging. A straightforward solution is to use multitask learning [16], but it needs to retrain the QG model using all the historical data whenever a new dataset is available. As a result, it is not scalable due to the linearly increasing computation and storage [6] costs.…”

Section: Introductionmentioning

confidence: 99%

Unified Question Generation with Continual Lifelong Learning

Yuan,

Yin,

et al. 2022

Preprint

View full text Add to dashboard Cite

Question Generation (QG), as a challenging Natural Language Processing task, aims at generating questions based on given answers and context. Existing QG methods mainly focus on building or training models for specific QG datasets. These works are subject to two major limitations: (1) They are dedicated to specific QG formats (e.g., answer-extraction or multi-choice QG), therefore, if we want to address a new format of QG, a re-design of the QG model is required. (2) Optimal performance is only achieved on the dataset they were just trained on. As a result, we have to train and keep various QG models for different QG datasets, which is resource-intensive and ungeneralizable.To solve the problems, we propose a model named Unified-QG based on lifelong learning techniques, which can continually learn QG tasks across different datasets and formats. Specifically, we first build a format-convert encoding to transform different kinds of QG formats into a unified representation. Then, a method named STRIDER (SimilariT y RegularI zed Difficult Example Replay) is built to alleviate catastrophic forgetting in continual QG learning. Extensive experiments were conducted on 8 QG datasets across 4 QG formats (answer-extraction, answer-abstraction, multi-choice, and boolean QG) to demonstrate the effectiveness of our approach. Experimental results demonstrate that our Unified-QG can effectively and continually adapt to QG tasks when datasets and formats vary. In addition, we verify the ability of a single trained Unified-QG model in improving 8 Question Answering (QA) systems' performance through generating synthetic QA data.

show abstract

Multi-Task Learning with Deep Neural Networks: A Survey

Cited by 118 publications

References 116 publications

INTERN: A New Learning Paradigm Towards General Vision

INTERN: A New Learning Paradigm Towards General Vision

Sharing to learn and learning to share -- Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning: A meta review

Unified Question Generation with Continual Lifelong Learning

Contact Info

Product

Resources

About