2021
DOI: 10.1007/s00521-021-06456-y

Which scaling rule applies to large artificial neural networks

Abstract: Experience shows that cooperating and communicating computing systems, comprising segregated single processors, have severe performance limitations, which cannot be explained using von Neumann’s classic computing paradigm. In his classic “First Draft,” he warned that using a “too fast processor” vitiates his simple “procedure” (but not his computing model!); furthermore, that using the classic computing paradigm for imitating neuronal operations is unsound. Amdahl added that large machines, comprising many pro…

Cited by 9 publications (19 citation statements)
References 45 publications
“…Their operating principle undergoes the general distributed processing principles. As discussed in [35], they can do valuable work at a small number of cores ('toy level') and can be useful embedded components in a general-purpose processor, but have severe performance limitations at large scale systems. They are sensitive to the synchronization issues discussed here, primarily if they use feedback and recurrency [16].…”
Section: Artificial Neural Network
Citation type: mentioning
confidence: 99%
“…In the case of AI-type workload, the performance with half-precision and double precision operands differ only marginally for vast systems. For details, see [35,42].…”
Section: Half-length Operands Vs Double-length Ones
Citation type: mentioning
confidence: 99%