Cause of SettingWithCopyWarning in a code snippet

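Problem description

While preparing medical training data to train classifiers for different medical tests, I got a SettingWithCopyWarning from pandas. I have read up on it and learned that it comes from chained indexing on a DataFrame, but I cannot figure out where the code below uses chained indexing.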

#Turn the 12 measurement result rows of each patient into one single row for each patient
#the different test results are named: result, result_1, ... , result_11 (for each result)
#the pid and age column is kept only once while all other column fields of the 12 measurement rows are concatenated
#into one single row, also the time field exists now 12 times per row

imputed_features.sort_values(by=['pid','Time'], inplace=True)
sorted_features = train_features.sort_values(by=['pid','Time'])
measurements = []
columns = []
for i in range(12):
    measurements.append(imputed_features.groupby(['pid'], as_index=False).nth(i))
    measurements[i].reset_index(drop=True, inplace=True)
    if( i == 0 ):
        columns = [i for i in measurements[i].columns]
    else:
        measurements[i].drop(['pid', 'Age'], axis=1, inplace=True)
        for j in measurements[i].columns:
            columns.append(f'{j}_{i}')

#the resulting aggregated_features DataFrame
aggregated_features = pd.concat(measurements[0:12], axis=1, ignore_index=True)
aggregated_features.columns = columns
aggregated_features.to_csv('aggregated_features.csv', index=False)

Tags: python, pandas, dataframe

Solution


I think it is because you are selecting a range with `measurements[0:12]`. Try passing the whole list instead:

pd.concat(measurements, ....)

If you still get the warning, copying `measurements` first and concatenating the copy might help:

measures = measurements.copy()
pd.concat(measures[0:12], ...)
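Another possible source, offered as an assumption rather than something confirmed in the question: in pandas versions that emit this warning, calling `drop(..., inplace=True)` on a frame that was selected out of another frame is a common trigger, and the frames returned by `groupby(...).nth(i)` are such selections. Note also that `list.copy()` copies only the list, not the DataFrames inside it. The sketch below avoids in-place modification entirely. The toy data, the `n_measurements` value, and the use of `cumcount()` in place of `nth(i)` are illustrative choices, not the original poster's code.

```python
import pandas as pd

# Hypothetical toy data: 2 patients with 3 measurements each (the question has 12).
imputed_features = pd.DataFrame({
    "pid": [1, 1, 1, 2, 2, 2],
    "Time": [1, 2, 3, 1, 2, 3],
    "Age": [34, 34, 34, 50, 50, 50],
    "result": [0.1, 0.2, 0.3, 1.1, 1.2, 1.3],
})
n_measurements = 3  # assumption for the toy data

# Sort without inplace=True so the original frame is left untouched.
sorted_features = imputed_features.sort_values(by=["pid", "Time"])

# Position of each row within its patient group: 0, 1, 2, ...
position = sorted_features.groupby("pid").cumcount()

measurements = []
for i in range(n_measurements):
    # Each call returns a new DataFrame; nothing is modified in place.
    part = sorted_features[position == i].reset_index(drop=True)
    if i > 0:
        # Keep pid and Age only once; suffix the remaining columns with i.
        part = part.drop(columns=["pid", "Age"]).add_suffix(f"_{i}")
    measurements.append(part)

# One row per patient, measurement columns side by side.
aggregated_features = pd.concat(measurements, axis=1)
```

If the `nth(i)` call is kept as in the question, appending `.copy()` to its result before modifying it should have a similar effect, since the copy is no longer tracked as a selection of `imputed_features`.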

