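python - Parallelizing a Python simulator with Ray
Problem description
I am new to ray and I am trying to parallelize a simulator I developed. Here is a simplified example of my simulator; the real one is obviously more complex.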
import some_library
import sim_library_with_global_object

class Model(object):
    def __init__(self, init_vals):
        # initialize the object using some of the global_object from sim_library;
        # the Model object has its own variables, which are not global
        pass

    def do_step(self, time):
        # calculate the Model step using the global_object from sim_library
        # and update the Model variables accordingly
        pass

class ManyModel(object):
    def __init__(self):
        self.models = []

    def add_model(self, init_vals):
        model = Model(init_vals)
        self.models.append(model)

    def step(self, time):
        for model in self.models:
            model.do_step(time)

    def get_data_step(self):
        data = []
        for model in self.models:
            data.append(model.myvalues)
        return data

sim = ManyModel()
inits = []  # list of init_vals
times = []  # list of times to simulate
for init in inits:
    sim.add_model(init)
for time in times:
    sim.step(time)
    step_data = sim.get_data_step()
So far I have tried using ray with the @ray.remote decorator in two ways: on the Model class (1) and on the ManyModel class (2).
(1)
############################## (1) ###############
import ray
import some_library
import sim_library_with_global_object

@ray.remote
class Model(object):
    def __init__(self, init_vals):
        # initialize the object using some of the global_object from sim_library;
        # the Model object has its own variables, which are not global
        pass

    def do_step(self, time):
        # calculate the Model step using the global_object from sim_library
        # and update the Model variables accordingly
        pass

class ManyModel(object):
    def __init__(self):
        self.models = []

    def add_model(self, init_vals):
        # each Model is its own Ray actor
        model = Model.remote(init_vals)
        self.models.append(model)

    def step(self, time):
        futures = []
        for model in self.models:
            futures.append(model.do_step.remote(time))
        return futures

    def get_data_step(self, futures):
        data = []
        while len(futures) > 0:
            # ray.wait returns (ready, not_ready); keep waiting on the rest
            ready, futures = ray.wait(futures)
            results = ray.get(ready)
            data.append(results)
        return data

ray.init()
sim = ManyModel()
inits = []  # list of init_vals
times = []  # list of times to simulate
for init in inits:
    sim.add_model(init)
for time in times:
    futures = sim.step(time)
    step_data = sim.get_data_step(futures)
(2)
########################## (2) #################
import ray
import some_library
import sim_library_with_global_object

class Model(object):
    def __init__(self, init_vals):
        # initialize the object using some of the global_object from sim_library;
        # the Model object has its own variables, which are not global
        pass

    def do_step(self, time):
        # calculate the Model step using the global_object from sim_library
        # and update the Model variables accordingly
        pass

@ray.remote
class ManyModel(object):
    def __init__(self):
        self.models = []
        self.data = []

    def add_model(self, init_vals):
        model = Model(init_vals)
        self.models.append(model)

    def step(self, time):
        for model in self.models:
            model.do_step(time)

    def get_data_step(self):
        self.data = []
        for model in self.models:
            self.data.append(model.myvalues)
        return self.data

ray.init()
sim = ManyModel.remote()
inits = []  # list of init_vals
times = []  # list of times to simulate
for init in inits:
    sim.add_model.remote(init)
for time in times:
    sim.step.remote(time)
    future = sim.get_data_step.remote()
    step_data = ray.get(future)
With both approaches I get no benefit from using the ray library. Can you help me use it properly?
Update for approach (1)
The problem with the first approach is that I get this warning message:
2020-11-09 11:33:20,517 WARNING worker.py:1779 -- WARNING: 12 PYTHON workers have been started. This could be a result of using a large number of actors, or it could be a consequence of using nested tasks (see https://github.com/ray-project/ray/issues/3644) for some a discussion of workarounds.
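Using 10 x Model, these are the performance results:
Without ray: 10 x Model -> do_step 0.11 [s]
With ray (1): 10 x Model -> do_step 0.22 [s]
In addition, every time I create an actor with approach (1), it copies all the global_objects of the imported libraries, and RAM consumption goes through the roof. I need to launch simulations with more than 100k Model objects.
Overall, I don't understand whether creating many actors in ray is a good idea.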
Solution
Zooming in on a few of the core elements:
ray.init()
sim = ManyModel.remote()
for time in times:
    sim.step.remote(time)
    future = sim.get_data_step.remote()
    step_data = ray.get(future)
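The most important point is that you only create a single Ray actor (in the line sim = ManyModel.remote()). Ray actors execute the tasks submitted to them serially (by default), so creating a single actor creates no opportunity for parallelism. To get parallelism with Ray actors, you need to create and use multiple actors.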
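The second point is that you call ray.get inside the for loop. This means that in every iteration of the loop you submit a task and then call ray.get, which waits for it to finish and retrieves the result. Instead, you will want to submit multiple tasks (possibly inside a loop) and then call ray.get outside the loop.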