Proceedings of the Genetic and Evolutionary Computation Conference 2018
DOI: 10.1145/3205455.3205489

Evolutionary architecture search for deep multitask networks

Abstract: Multitask learning, i.e. learning several tasks at once with the same neural network, can improve performance in each of the tasks. Designing deep neural network architectures for multitask learning is a challenge: there are many ways to tie the tasks together, and the design choices matter. The size and complexity of this problem exceeds human design ability, making it a compelling domain for evolutionary optimization. Using the existing state-of-the-art soft-ordering architecture as the starting point, metho…
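To make the abstract's idea concrete, below is a minimal, illustrative sketch of an evolutionary loop over per-task module routings in a shared network. The genome encoding, truncation selection, and the placeholder fitness function are assumptions for illustration only; they do not reproduce the paper's actual method, which evolves soft-ordering multitask architectures and evaluates them by training DNNs.

```python
# Illustrative sketch (not the paper's implementation): evolve which shared
# module each task uses at each depth of a multitask network.
import random

NUM_TASKS = 4        # tasks sharing the network (assumed)
NUM_SLOTS = 3        # depth positions where a shared module is placed (assumed)
NUM_MODULES = 5      # pool of shared modules to route between (assumed)

def random_genome():
    # genome[t][s] = index of the shared module used by task t at depth s
    return [[random.randrange(NUM_MODULES) for _ in range(NUM_SLOTS)]
            for _ in range(NUM_TASKS)]

def mutate(genome, rate=0.1):
    # Point mutation: reassign a module index with small probability.
    child = [row[:] for row in genome]
    for t in range(NUM_TASKS):
        for s in range(NUM_SLOTS):
            if random.random() < rate:
                child[t][s] = random.randrange(NUM_MODULES)
    return child

def fitness(genome):
    # Placeholder objective: in practice this would build and train the
    # multitask DNN defined by the routing and return validation performance.
    distinct_routes = len({tuple(row) for row in genome})
    return -distinct_routes + random.random()  # stand-in only

def evolve(pop_size=20, generations=10):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]          # truncation selection
        population = parents + [mutate(random.choice(parents)) for _ in parents]
    return max(population, key=fitness)

if __name__ == "__main__":
    print("best routing:", evolve())
```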

Cited by 88 publications (66 citation statements)
References 30 publications
“…In addition to the three goals of evolutionary AutoML demonstrated in this paper, a fourth one is to take advantage of multiple related datasets. As shown in prior work [30], even when there is little data to train a DNN in a particular task, other tasks in a multitask setting can help achieve good performance. Evolutionary AutoML thus forms a framework for utilizing DNNs in domains that otherwise would be impractical due to lack of data.…”
Section: Discussion
confidence: 95%
“…SFGs probabilistically define the grouping of kernels and thus the connectivity of features in a CNN. We use variational inference to approximate the distribution. Our method can be considered as a probabilistic form of multi-task architecture learning [34], as the learned posterior embodies the optimal MTL architecture given the data.…”
Section: Discussion
confidence: 99%
“…A recent step in this direction is made by Liu et al [19] who propose an adaptive MTL model that structurally groups tasks together. Evolutionary algorithms have also been shown to capture task relatedness and create sharing structures [16]. A less architectural solution is proposed by Yang et al [40] who use a factorized space representation to initialize and learn intertask sharing structures at each layer in an MTL model.…”
Section: Related Work
confidence: 99%
“…This search duration grows proportionally with the number of tasks and parameters present in the model's structure. Previous works in both MTL and STL rely on evolutionary algorithms [16] or factorization techniques [40] to discover their optimal way of learning; however, this takes time and prolongs the training process. In our work, inspired by the efficiency of Random Search [3], we enforce a structured random solution to this problem by regulating the per-task data-flow in our models.…”
Section: Introduction
confidence: 99%