2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC)
DOI: 10.1109/hipc.2019.00032

On Linear Learning with Manycore Processors

Abstract: A new generation of manycore processors is on the rise that offers dozens and more cores on a chip and, in a sense, fuses host processor and accelerator. In this paper we target the efficient training of generalized linear models on these machines. We propose a novel approach for achieving parallelism which we call Heterogeneous Tasks on Homogeneous Cores (HTHC). It divides the problem into multiple fundamentally different tasks, which themselves are parallelized. For evaluation, we design a detailed, architec…
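The abstract only sketches how HTHC divides the work, so here is a minimal, hypothetical C++ sketch of the general idea: two fundamentally different tasks, a coordinate-selection task and a model-update task, run concurrently on disjoint groups of cores of the same chip. The toy importance score, the step size, and the relaxed-atomic synchronization are illustrative assumptions, not the paper's actual scheme.

```cpp
// Hypothetical sketch of the HTHC idea: two heterogeneous tasks run
// concurrently on homogeneous cores. Task A continuously scores
// coordinates; task B picks the best-scored coordinate and updates it.
#include <atomic>
#include <cmath>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    const int n = 1024;                         // model dimension (toy size)
    std::vector<std::atomic<double>> model(n);  // shared model, zero-initialized
    std::vector<std::atomic<double>> score(n);  // per-coordinate importance
    std::atomic<bool> done{false};

    // Task A ("selection"): would run on one group of cores. The score
    // here is a toy stand-in for an importance measure such as the
    // duality-gap-based criteria used for coordinate selection in GLMs.
    std::thread task_a([&] {
        while (!done.load(std::memory_order_acquire)) {
            for (int j = 0; j < n; ++j) {
                double m = model[j].load(std::memory_order_relaxed);
                score[j].store(std::abs(1.0 - m), std::memory_order_relaxed);
            }
        }
    });

    // Task B ("update"): would run on the remaining cores. It repeatedly
    // picks the currently most important coordinate and applies a toy
    // coordinate-descent step toward the (known) optimum 1.0.
    std::thread task_b([&] {
        for (int iter = 0; iter < 200000; ++iter) {
            int best = 0;
            for (int j = 1; j < n; ++j)
                if (score[j].load(std::memory_order_relaxed) >
                    score[best].load(std::memory_order_relaxed))
                    best = j;
            double m = model[best].load(std::memory_order_relaxed);
            model[best].store(m + 0.5 * (1.0 - m), std::memory_order_relaxed);
        }
        done.store(true, std::memory_order_release);
    });

    task_a.join();
    task_b.join();
    std::printf("model[0] after updates: %f\n",
                model[0].load(std::memory_order_relaxed));
    return 0;
}
```

Per the abstract, in HTHC the two tasks are themselves parallelized across their core groups; the sketch uses one thread per task only to keep the control flow visible.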

Cited by 2 publications (1 citation statement)
References 20 publications (20 reference statements)
“…Shared-memory architectures span compute resources such as many-core CPUs and hardware accelerators, which have many compute nodes that share the same physical memory space (Figure 1, left). Even though the nodes share the same physical memory, non-uniform memory access (NUMA) designs and deep cache hierarchies in today's architectures invalidate the assumption that nodes have immediate access to a memory region (see, e.g., [20][21][22][23][24][25] that revisit algorithms and take this issue into account). In shared-memory architectures, all the serialization primitives are in the same physical space.…”
Section: Architectures (mentioning)
confidence: 99%
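The NUMA point in the quoted statement is easy to see in code. Below is a minimal, self-contained C++/OpenMP sketch (not taken from the cited works) of the standard first-touch idiom: on common operating systems, a page is placed on the NUMA node of the thread that first writes it, so initializing an array with the same parallel schedule as the later compute loop keeps each thread's data node-local.

```cpp
// First-touch NUMA placement sketch: initialize and compute with the
// same static schedule so each thread mostly touches node-local pages.
#include <omp.h>
#include <cstdio>
#include <cstdlib>

int main() {
    const long n = 1L << 24;  // ~128 MiB per array of doubles
    double* x = static_cast<double*>(std::malloc(n * sizeof(double)));
    double* y = static_cast<double*>(std::malloc(n * sizeof(double)));

    // First touch: pages of x and y are faulted in here, on the node
    // of whichever thread writes them under this static schedule.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        x[i] = 1.0;
        y[i] = 2.0;
    }

    // Compute loop: the same schedule means each thread revisits the
    // pages it placed, so most accesses stay on the local node.
    double sum = 0.0;
    #pragma omp parallel for schedule(static) reduction(+ : sum)
    for (long i = 0; i < n; ++i)
        sum += x[i] * y[i];

    std::printf("dot = %f\n", sum);
    std::free(x);
    std::free(y);
    return 0;
}
```

If the two loops used different schedules, threads would frequently reach across the interconnect for remote pages, which is exactly the failed "immediate access" assumption the quoted statement refers to.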