python - Dataframes getting chucked off in ThreadPoolExecutor Throwing Multiple random Errors
Problem description
I am running a Flask application that uses multithreading via ThreadPoolExecutor. The algorithm imports 18 different classes, each corresponding to a different use case. The user input (a DataFrame, usually 20-30 rows) needs to be processed through methods from these classes. The following is a sample snippet from the route:
from concurrent.futures import ThreadPoolExecutor

def run_methods(use_case_class):
    # Run one use case and return its DataFrame and dict outputs
    output_df, output_dict = use_case_class.method_computation()
    return output_df, output_dict

@app.route('/post_request', methods=['POST'])
def func():
    use_case1 = Class1(arg1, arg2)
    use_case2 = Class2(arg1, arg2)
    use_cases_list = [use_case1, use_case2]
    with ThreadPoolExecutor(max_workers=10) as executor:
        # executor.map returns a lazy iterator of (output_df, output_dict) tuples
        final_output = executor.map(run_methods, use_cases_list)
Here I have one input object and multiple methods, which would ideally be a case for multiprocessing, but since processes take quite some time to spin up, I cannot use it. My target is to complete all computations within 700 ms, so I tried multithreading. The internal structure of the classes is very similar: the workflow, the method names, and the objects used are all alike. What happens is that I randomly get multiple errors. It seems that multithreading is corrupting objects intermittently.
The errors trace back to operations like the following:
Ex 1:
df.loc[:, ['var1','var2','var3','var4']].fillna(0, inplace=True)
Ex 2:
varlist = [item for sublist in list(self.dict_input.values()) for item in sublist]
for var1 in varlist:
    df1[var1] = df1[var1].astype(str)
Ex 3:
df['col1'] = df[self.vars_list[0]].str.cat(df[[var for var in varlist1 if self.vars_list[0] not in var]],sep='-')
Errors:
For Ex 1 and 2: AssertionError: Gaps in blk ref_locs
For Ex 3: TypeError: Concatenation requires list-likes containing only strings (or missing values). Offending values found in column floating
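Independently of threading, two of the quoted operations have pitfalls that are worth ruling out. In Ex 1, `fillna(..., inplace=True)` is applied to the temporary object returned by `df.loc[...]`, so it is not a reliable way to update `df`. In Ex 3, `str.cat` only accepts string (or missing) values, which is what the TypeError reports. The following is a minimal sketch with made-up columns (`var1`, `var2`, `key`, `num` are assumptions for illustration, not the original data):

```python
import pandas as pd

# Hypothetical sample data; column names are placeholders.
df = pd.DataFrame({
    "var1": [1.0, None, 3.0],
    "var2": [None, 2.0, 3.0],
    "key": ["a", "b", "c"],
    "num": [1.5, 2.5, 3.5],
})

# Ex 1 alternative: assign the filled result back instead of calling
# fillna(inplace=True) on a temporary selection.
num_cols = ["var1", "var2"]
df[num_cols] = df[num_cols].fillna(0)

# Ex 3 alternative: cast the other columns to str immediately before
# str.cat, so a float column cannot reach the concatenation.
others = df[["num"]].astype(str)
df["col1"] = df["key"].str.cat(others, sep="-")
```

This does not by itself explain the intermittent failures, but it removes two sources of surprising behavior from the picture.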
I also tried changing the variable names in one of the classes, but that did not help.
Can anyone please help? I am completely stuck on this issue.
Thanks in advance.
Solution
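No answer is recorded in the source. One plausible cause, offered here as an assumption rather than a confirmed diagnosis, is that several use-case objects hold a reference to the same DataFrame and mutate it concurrently. pandas objects are not guaranteed to be thread-safe, and internal errors such as "Gaps in blk ref_locs" are consistent with one thread changing a frame's columns while another reads it. A minimal sketch of the usual workaround is to give every use case its own deep copy, made in the request thread before any worker starts. `UseCase`, `process`, and `fill_value` are hypothetical stand-ins for the real classes and arguments:

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


class UseCase:
    """Hypothetical stand-in for Class1, Class2, and so on."""

    def __init__(self, df, fill_value):
        self.df = df  # private copy owned by this instance only
        self.fill_value = fill_value

    def method_computation(self):
        # Mutates only this instance's private DataFrame.
        self.df["var1"] = self.df["var1"].fillna(self.fill_value)
        return self.df, {"rows": len(self.df)}


def run_methods(use_case):
    return use_case.method_computation()


def process(input_df):
    # Copies are made here, in the calling thread, before any worker runs,
    # so no two threads ever touch the same DataFrame object.
    use_cases = [UseCase(input_df.copy(deep=True), value) for value in (0, -1)]
    with ThreadPoolExecutor(max_workers=10) as executor:
        # list() forces the lazy iterator and re-raises any worker exception.
        return list(executor.map(run_methods, use_cases))


input_df = pd.DataFrame({"var1": [1.0, None, 3.0]})
results = process(input_df)
```

For inputs of 20-30 rows the cost of the copies should be small relative to a 700 ms budget, though that is worth measuring. If the classes also share module-level or class-level DataFrames, those would need the same treatment.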