Proceedings of the Eighth International Workshop on Natural Language Processing for Social Media 2020
DOI: 10.18653/v1/2020.socialnlp-1.8

Multi-Task Supervised Pretraining for Neural Domain Adaptation

Abstract: Recent work in NLP uses two main transfer learning approaches to improve neural network performance for domains that are under-resourced in annotated data. 1) Multi-task learning trains the task of interest jointly with related tasks to exploit their underlying similarities. 2) Mono-task pretraining, where the target model's parameters are pretrained on a large-scale labelled source domain and then fine-tuned on labelled data from the target domain (the domain of interest). In this paper, we propose a n…
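
The two strategies outlined in the abstract can be illustrated with a short sketch. The PyTorch snippet below is an illustrative assumption, not the paper's implementation: the encoder, task heads, dimensions, and task names (POS, chunking) are placeholders.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Toy encoder shared across tasks (stands in for the paper's sentence encoder)."""
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        out, _ = self.rnn(self.embed(token_ids))
        return out  # (batch, seq_len, hidden_dim)

class TaskHead(nn.Module):
    """Per-task classification head (e.g. POS tagging, chunking)."""
    def __init__(self, hidden_dim=128, num_labels=10):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, num_labels)

    def forward(self, states):
        return self.proj(states)

encoder = SharedEncoder()
heads = nn.ModuleDict({"pos": TaskHead(num_labels=17), "chunk": TaskHead(num_labels=23)})
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(heads.parameters()), lr=1e-3)

def multitask_step(batches):
    """1) Multi-task learning: sum per-task losses over a shared encoder.
    batches maps task name -> (token_ids, labels)."""
    optimizer.zero_grad()
    loss = 0.0
    for task, (tokens, labels) in batches.items():
        logits = heads[task](encoder(tokens))
        loss = loss + nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), labels.view(-1))
    loss.backward()
    optimizer.step()
    return float(loss)

# 2) Mono-task pretraining: train one head on a large labelled source domain,
#    then fine-tune the same encoder (and head) on the small target domain,
#    i.e. two successive training loops calling multitask_step with a single task.
```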

Cited by 11 publications (7 citation statements). References 37 publications.

“… 3 Combining and then jointly training a small dataset of interest with a larger auxiliary dataset often helps the prediction accuracy. 22 , 23 For the auxiliary dataset, we downloaded the publicly available SIIM-ISIC Melanoma Classification Challenge Dataset from 2018 to 2020. 16 , 24 This dataset contains 58,459 images of 9 skin cancer diseases: actinic keratosis, basal cell carcinoma, benign keratosis, dermatofibroma, melanoma, melanocytic nevus, squamous cell carcinoma, vascular lesion, and other unknown skin cancer cases.…”
Section: Methods
confidence: 99%
“…Winata et al (2018) weighted losses for language modeling and POS tagging in an MTL setting, finding that a lower weight on language modeling yielded a reduction in perplexity in modeling codeswitching between Chinese and English. A multi-task supervised pretraining adaptation strategy using a hierarchical architecture that learns multiple tasks on a source domain before fine-tuning them on the target was implemented by Meftah et al (2020). By using different weights for the different level tasks, starting with higher weights for lower tasks before incrementally increasing weights to higher level tasks during training, they achieve a noticeable error reduction in POS tagging, dependency parsing, and chunking.…”
Section: MTL Performance
confidence: 99%
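
The incremental task-weighting schedule described in that statement could look like the sketch below. The linear schedule, task names, and the 0.5-1.0 weight range are illustrative assumptions, not values reported by Meftah et al (2020).

```python
def task_weights(epoch, total_epochs, low_tasks=("pos",), high_tasks=("chunk", "parse")):
    """Per-task loss weights that shift from low-level to high-level tasks over training."""
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)  # goes from 0.0 to 1.0
    weights = {t: 1.0 - 0.5 * frac for t in low_tasks}          # e.g. 1.0 -> 0.5
    weights.update({t: 0.5 + 0.5 * frac for t in high_tasks})   # e.g. 0.5 -> 1.0
    return weights

# Example: weighted sum of per-task losses at a given epoch
# loss = sum(task_weights(epoch, 10)[t] * task_losses[t] for t in task_losses)
```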
“…MuTSPad (Multi-Task Supervised Pretraining and Adaptation) (Meftah et al, 2020) leverages hierarchical learning of a multi-task model on high-resource domain followed by fine-tuning on multiple tasks on the low-resource target domain.…”
Section: Domain
confidence: 99%
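
A minimal sketch of the two-phase recipe quoted above (supervised multi-task pretraining on the source domain, then fine-tuning on the target domain); the function name, signature, and learning rates are hypothetical, not taken from the paper.

```python
def run_mutspad_style(model, source_loaders, target_loaders, train_multitask_fn):
    """train_multitask_fn(model, loaders, lr) is assumed to run one joint multi-task training pass."""
    # Phase 1: hierarchical multi-task learning on the high-resource source domain
    train_multitask_fn(model, source_loaders, lr=1e-3)
    # Phase 2: fine-tune all tasks on the low-resource target domain with the same parameters
    train_multitask_fn(model, target_loaders, lr=1e-4)
    return model
```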