2017
DOI: 10.48550/arxiv.1711.01239
Preprint

Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning

Abstract: Multi-task learning (MTL) with neural networks leverages commonalities in tasks to improve performance, but often suffers from task interference which reduces the benefits of transfer. To address this issue we introduce the routing network paradigm, a novel neural network and training algorithm. A routing network is a kind of self-organizing neural network consisting of two components: a router and a set of one or more function blocks. A function block may be any neural network, for example a fully-connected o…
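
The router-plus-function-blocks structure described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not in the text above: the router is a plain score table indexed by task and routing step (a simplification that ignores the input itself), routing stops after a fixed number of steps, and the function blocks are tiny fully-connected layers with ReLU. The names `TabularRouter`, `make_linear_relu_block`, and `route_forward` are hypothetical, and the training of the router is not shown.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
FunctionBlock = Callable[[Vector], Vector]


def make_linear_relu_block(weights: Sequence[Sequence[float]]) -> FunctionBlock:
    """Build a tiny fully-connected block computing ReLU(W @ x)."""
    def block(x: Vector) -> Vector:
        return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]
    return block


class TabularRouter:
    """Hypothetical router: one preference score per (task, step, block).

    Assumption: decisions depend only on the task and the routing step,
    not on the activation being routed.
    """

    def __init__(self, num_tasks: int, depth: int, num_blocks: int,
                 epsilon: float = 0.0, seed: int = 0) -> None:
        self.scores = [[[0.0] * num_blocks for _ in range(depth)]
                       for _ in range(num_tasks)]
        self.epsilon = epsilon          # probability of a random choice
        self.rng = random.Random(seed)

    def choose(self, task: int, step: int) -> int:
        row = self.scores[task][step]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(row))
        # Greedy choice; ties resolve to the lowest block index.
        return max(range(len(row)), key=row.__getitem__)


def route_forward(x: Vector, task: int, router: TabularRouter,
                  blocks: Sequence[FunctionBlock],
                  depth: int) -> Tuple[Vector, List[int]]:
    """Apply `depth` routed blocks in sequence; return output and chosen path."""
    path: List[int] = []
    for step in range(depth):
        idx = router.choose(task, step)   # routing decision
        x = blocks[idx](x)                # apply the selected function block
        path.append(idx)
    return x, path
```

In this sketch, two tasks can share a block at one step and diverge at another simply by holding different scores in the router table, which is one way to picture how routing might limit task interference while still allowing transfer.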

Citations: cited by 38 publications (54 citation statements)
References: 12 publications
“…To grow the number of model parameters without proportionally increasing the computational cost, conditional computation [5,15,12] only activates some relevant parts of the model in an input-dependent fashion, like in decision trees [7]. In deep learning, the activation of portions of the model can use stochastic neurons [6] or reinforcement learning [4,17,53].…”
Section: Related Work (mentioning)
confidence: 99%
“…BASELayers (Lewis et al., 2021) circumvents this problem by treating the routing mechanism as a linear expert-to-task assignment problem, without the need of auxiliary loss. Routing networks (Rosenbaum et al., 2017) learn better task representations by clustering and disentangling parameters conditioned on input.…”
Section: Related Work (mentioning)
confidence: 99%
“…Multi-task Learning. Due to its benefit with regards to data and computational efficiency, multi-task learning (MTL) has broad applications in vision, language, and robotics [11,28,22,44,38]. A number of MTL-friendly architectures have been proposed using task-specific modules [25,11], attention-based mechanisms [21] or activating different paths along the deep networks to tackle MTL [27,40].…”
Section: Related Work (mentioning)