2017 · Preprint
DOI: 10.48550/arxiv.1712.05889

Ray: A Distributed Framework for Emerging AI Applications

Abstract: The next generation of AI applications will continuously interact with the environment and learn from these interactions. These applications impose new and demanding systems requirements, both in terms of performance and flexibility. In this paper, we consider these requirements and present Ray, a distributed system to address them. Ray implements a unified interface that can express both task-parallel and actor-based computations, supported by a single dynamic execution engine. To meet the performance requirem…
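As a rough illustration of the unified interface the abstract describes, the sketch below pairs a task-parallel remote function with a stateful actor. The @ray.remote, .remote(), and ray.get primitives are Ray's actual Python API; the preprocess and Counter names are hypothetical.

```python
import ray

ray.init()

# Task-parallel: a stateless function becomes a remote task.
@ray.remote
def preprocess(x):
    return x * 2

# Actor-based: a class becomes a stateful remote actor.
@ray.remote
class Counter:
    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value
        return self.total

futures = [preprocess.remote(i) for i in range(4)]  # tasks run in parallel
counter = Counter.remote()
for f in futures:
    counter.add.remote(ray.get(f))
print(ray.get(counter.add.remote(0)))  # -> 12 (0 + 2 + 4 + 6)
```

Both styles are scheduled by the same execution engine, which is the design point the abstract emphasizes.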

Cited by 54 publications (58 citation statements)
References 23 publications
“…Line counts include lines used for logging and debugging functionality. We implemented Tune using the Ray (Moritz et al (2017)) framework, which as noted earlier provides the actor abstraction used to run trials in Tune. In contrast to popular distributed frameworks such as Spark (Zaharia et al (2012)), or MPI (Gabriel et al (2004)), Ray offers a more flexible programming model.…”
Section: Methods
confidence: 99%
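The actor-per-trial pattern this excerpt describes can be sketched as follows; Trial and its methods are hypothetical stand-ins for illustration, not Tune's actual classes.

```python
import ray

ray.init()

@ray.remote
class Trial:
    """Hypothetical stand-in for a Tune trial: actor state persists across calls."""
    def __init__(self, config):
        self.config = config
        self.iteration = 0

    def step(self):
        self.iteration += 1
        # A real trial would run one training iteration here.
        return {"iteration": self.iteration, "lr": self.config["lr"]}

trials = [Trial.remote({"lr": lr}) for lr in (0.1, 0.01, 0.001)]
results = ray.get([t.step.remote() for t in trials])  # all trials step in parallel
```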
“…To meet these requirements, we propose the Tune user-facing and scheduling APIs (Section 4) and implement it on the Ray distributed computing framework (Moritz et al (2017)). The Ray framework provides the underlying distributed execution and resource management.…”
Section: Requirements for API Generality
confidence: 99%
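A minimal sketch of what Tune's user-facing API looks like when run on Ray, assuming the tune.run/tune.report function API of earlier Ray releases; the trainable and its search space are illustrative only.

```python
from ray import tune

def trainable(config):
    # Hypothetical objective; a real trainable would train and evaluate a model.
    tune.report(score=1.0 / config["lr"])

analysis = tune.run(
    trainable,
    config={"lr": tune.grid_search([0.001, 0.01, 0.1])},
    resources_per_trial={"cpu": 1},  # Ray handles placement and resource accounting
)
print(analysis.get_best_config(metric="score", mode="max"))
```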
“…To handle task distribution, pyscreener relies on the ray library [13] for distributed computation. For multithreaded docking software, pyscreener allows a user to specify how many CPU cores to run each individual docking simulation over, running as many docking simulations in parallel as possible for a given number of total CPU cores in the ray cluster.…”
Section: Implementation and Performance
confidence: 99%
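The per-simulation core reservation the excerpt describes maps onto Ray's num_cpus resource request, as in the sketch below; dock() is a hypothetical stub, not pyscreener's API.

```python
import ray

ray.init()

# Reserving 4 cores per task means Ray schedules at most
# total_cluster_cpus // 4 docking simulations concurrently.
@ray.remote(num_cpus=4)
def dock(ligand):
    # Hypothetical stub; pyscreener would invoke the multithreaded docking
    # binary here with a thread count matching the reservation.
    return {"ligand": ligand, "score": -7.2}

scores = ray.get([dock.remote(l) for l in ["ZINC00001", "ZINC00002"]])
```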
“…Auto-scaling and Fault Tolerance: Efforts to add fault tolerance to ScaLAPACK have so far been demonstrated to incur significant performance overhead [11]. For almost all BSP and dataflow systems [30,24,29], recomputation is required to restore stateful workers or datasets that have not been checkpointed. MadLINQ [34] also uses dependency tracking to minimize recomputation for its pipelined execution.…”
Section: Related Work
confidence: 99%