Dataframes getting chucked off in ThreadPoolExecutor Throwing Multiple random Errors

Problem description

I am running a Flask application with multithreading using ThreadPoolExecutor. The application imports 18 different classes, each corresponding to a different use case. The user input (a DataFrame, usually 20 to 30 rows) needs to be processed by methods from these classes. The following is a sample snippet from the route:

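from concurrent.futures import ThreadPoolExecutor

def run_methods(use_case_class):
    # Every use-case object exposes the same method_computation() interface.
    output_df, output_dict = use_case_class.method_computation()
    return output_df, output_dict

@app.route('/post_request', methods=['POST'])
def func():
    # arg1 and arg2 are defined elsewhere (not shown in this snippet).
    use_case1 = Class1(arg1, arg2)
    use_case2 = Class2(arg1, arg2)
    use_cases_list = [use_case1, use_case2]

    with ThreadPoolExecutor(max_workers=10) as executor:
        # executor.map returns a lazy iterator; a worker's exception is
        # re-raised only when the corresponding result is retrieved.
        final_output = executor.map(run_methods, use_cases_list)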

Here I have one object and multiple methods, which would ideally be a case for multiprocessing, but since processes take quite some time to spin up, I cannot use it. My target is to complete all computations within 700 ms, so I tried multithreading instead. The internal structure of the classes is very similar: the workflow, the method names, and the names of the objects used are the same. What happens is that I randomly get multiple errors. It seems that multithreading is intermittently corrupting objects.

The errors trace back to operations like the following:

Ex 1:

df.loc[:, ['var1','var2','var3','var4']].fillna(0, inplace=True)

Ex 2:

varlist = [item for sublist in list(self.dict_input.values()) for item in sublist]
for var1 in varlist:
   df1[var1] = df1[var1].astype(str)

Ex 3:

df['col1'] = df[self.vars_list[0]].str.cat(df[[var for var in varlist1 if self.vars_list[0] not in var]],sep='-')

Errors:

For Ex 1 and 2: AssertionError: Gaps in blk ref_locs

For Ex 3: TypeError: Concatenation requires list-likes containing only strings (or missing values). Offending values found in column floating
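
Independently of the threading question, two of the lines above do something different from what they appear to do, even in a single thread. In Ex 1, `df.loc[:, [...]]` with a list of columns returns a new object, so `fillna(0, inplace=True)` fills that temporary and leaves `df` unchanged (recent pandas versions may also emit a warning for it). In Ex 3, `str.cat` raises the quoted TypeError whenever one of the other columns is not of string dtype. The following is a minimal sketch with made-up data; the frame, the `key` column, and the values are illustrative assumptions, not taken from the application:

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the user input.
df = pd.DataFrame({
    "key": ["a", "b", "c"],
    "var1": [1.0, np.nan, 3.0],
    "var2": [np.nan, 2.0, 3.0],
})
cols = ["var1", "var2"]

# Ex 1 as written: fills a temporary copy, so df still contains NaN.
df.loc[:, cols].fillna(0, inplace=True)
still_missing = int(df[cols].isna().sum().sum())

# Assigning the filled result back is what actually updates df.
df[cols] = df[cols].fillna(0)
missing_after = int(df[cols].isna().sum().sum())

# Ex 3: cast the other columns to str before concatenating them.
others = df[cols].astype(str)
df["col1"] = df["key"].str.cat(others, sep="-")
```

This does not explain why the errors are intermittent, but it removes two sources of confusion while debugging.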

I also tried changing the variable names in one of the classes, but that did not help.

Can anyone please help? I am completely stuck on this issue.

Thanks in advance.

Tags: python, threadpoolexecutor, concurrent.futures

Solution
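
One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is that the use-case objects hold references to the same input DataFrame and modify it from several threads at once. pandas objects are not safe to mutate concurrently, and an internal assertion such as `Gaps in blk ref_locs` is consistent with that kind of corruption. A minimal sketch of isolating each task follows; `UseCase`, its constructor, and the sample data are hypothetical stand-ins for `Class1`, `Class2`, and the real input:

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


class UseCase:
    """Hypothetical stand-in for Class1, Class2, and so on."""

    def __init__(self, df, label):
        # Deep copy taken in the main thread, before any worker starts,
        # so no two tasks ever share the same underlying data.
        self.df = df.copy(deep=True)
        self.label = label

    def method_computation(self):
        df = self.df
        df[["var1", "var2"]] = df[["var1", "var2"]].fillna(0)
        df["total"] = df["var1"] + df["var2"]
        return df, {"label": self.label, "rows": len(df)}


def run_methods(use_case):
    return use_case.method_computation()


input_df = pd.DataFrame({"var1": [1.0, None], "var2": [None, 2.0]})
use_cases = [UseCase(input_df, f"case{i}") for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as executor:
    # list() forces iteration, so any worker exception is raised here.
    results = list(executor.map(run_methods, use_cases))
```

With 20 to 30 rows the copies are cheap. If the objects genuinely have to share one frame, guarding every mutation with a `threading.Lock` is the usual alternative, at the cost of serializing that part of the work.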

